From forax at univ-mlv.fr Fri Mar 2 14:30:01 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 2 Mar 2018 15:30:01 +0100 (CET) Subject: Disallowing break label (and continue label) inside an expression switch Message-ID: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> Hi all, as far as i remember, the current idea to differentiate between a break label and a break value is to let the compiler figure this out, i wonder if it's not simpler to disallow break label (and continue label) inside an expression switch. After all, an expression switch do not exist yet, so no backward compatibility issue, it may make some refactoring impossible but had the great advantage to do not allow a lot of puzzler codes like the one below. enum Result { ONE, MANY } Result result(String[] args) { ONE: for(String s: args) { return switch(s) { case "several": case "many": break MANY; case "one": break ONE; default: continue; }; } throw ...; } R?mi From brian.goetz at oracle.com Fri Mar 2 16:12:16 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 2 Mar 2018 11:12:16 -0500 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> Message-ID: <48dc583f-73f1-2509-34d1-52af30e2cc5d@oracle.com> Thanks for bringing this up.? I remember it being discussed once before, but I don't think we acted on it. I agree that expression switch is an expression, and it should either yield a value or throw something; breaking out of the middle of an expression is not something we have, nor does it seem necessary. (Though I'm sure clever folks could come up with a good example where it would be convenient.) A sensible extension of this is no "return" from a switch expression either: ??? int foo(int x) { ??????? return switch (x) { ??????????? case 1 -> 2; ??????????? case 2 -> 4; ??????????? case 3 -> 8; ??????????? default: return Integer.MAX_VALUE; ??????? } ??? } Like conditionals, then, switch expressions would either yield a value (through breaking) or throw.? This seems consistent, but...what happens when we nest a statement in a switch expression? ??? void foo(int x, int y, int z) { ??????? TOP: ??????? switch (z) { ??????????? case 1: ? ? ? ?? ?????? int i = switch (x) { ??????? ? ? ? ?? ?? case 1 -> 2; ? ? ? ?? ?????????? case 2: ??????? ? ? ? ?? ?????? switch (y) { ??????????????? ? ? ? ?? ?? case 1: return; ??????????????????????????? default: break TOP; ??????????????????????? } ??????????????? } ??????? } ??? } Do we disallow the "break TOP" and return in the inner switch? IOW, does the expression form a barrier through which control can only pass via break or exceptions? On 3/2/2018 9:30 AM, Remi Forax wrote: > Hi all, > as far as i remember, the current idea to differentiate between a break label and a break value is to let the compiler figure this out, > i wonder if it's not simpler to disallow break label (and continue label) inside an expression switch. > > After all, an expression switch do not exist yet, so no backward compatibility issue, it may make some refactoring impossible but had the great advantage to do not allow a lot of puzzler codes like the one below. 
> > enum Result { > ONE, MANY > } > > Result result(String[] args) { > ONE: for(String s: args) { > return switch(s) { > case "several": > case "many": > break MANY; > case "one": > break ONE; > default: > continue; > }; > } > throw ...; > } > > R?mi From kevinb at google.com Fri Mar 2 16:40:05 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 2 Mar 2018 08:40:05 -0800 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <48dc583f-73f1-2509-34d1-52af30e2cc5d@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <48dc583f-73f1-2509-34d1-52af30e2cc5d@oracle.com> Message-ID: I would very much favor constraining what can be done inside an expression switch as sharply as we can. In fact, if we were designing the language from scratch, all at the same time, but having already decided the behavior of the conditional operators ?: ... might we not logically design this as similarly as possible to that? // compare this two-way choice return status == ABC ? something() : somethingElse(); // to this three-way choice return status *??* { ABC -> something(); DEF, XYZ -> somethingElse(); default -> somethingOther(); } (stand-in syntax only) You could ideally think of ?: as sugar for the latter. Might we possibly be better off thinking of it this way, and not even worrying about multi-statement cases (you can always extract a method)? It seems like several of our woes disappear. Something to consider is whether we want a construct that readability-conscious developers will use *only* as the object of a return statement or variable assignment (which is what I think we have, currently), or one that they might comfortably use in-line in other circumstances, like Result r = methodCall( param1, param2, status *??* { ABC -> something(); DEF, XYZ -> somethingElse(); default -> somethingOther(); }); I can see that usage at least being debatable, while I suspect we might frown on it all spelled out with `switch` and `case`....? As a last point in its favor, `null` can be treated completely normally - if you didn't list `null ->` or `ABC, null ->`, etc., then use the default. On Fri, Mar 2, 2018 at 8:12 AM, Brian Goetz wrote: > Thanks for bringing this up. I remember it being discussed once before, > but I don't think we acted on it. > > I agree that expression switch is an expression, and it should either > yield a value or throw something; breaking out of the middle of an > expression is not something we have, nor does it seem necessary. (Though > I'm sure clever folks could come up with a good example where it would be > convenient.) > > A sensible extension of this is no "return" from a switch expression > either: > > int foo(int x) { > return switch (x) { > case 1 -> 2; > case 2 -> 4; > case 3 -> 8; > default: return Integer.MAX_VALUE; > } > } > > Like conditionals, then, switch expressions would either yield a value > (through breaking) or throw. This seems consistent, but...what happens > when we nest a statement in a switch expression? > > void foo(int x, int y, int z) { > TOP: > switch (z) { > case 1: > int i = switch (x) { > case 1 -> 2; > case 2: > switch (y) { > case 1: return; > default: break TOP; > } > } > } > } > > Do we disallow the "break TOP" and return in the inner switch? IOW, does > the expression form a barrier through which control can only pass via break > or exceptions? 
> > > > > > > On 3/2/2018 9:30 AM, Remi Forax wrote: > >> Hi all, >> as far as i remember, the current idea to differentiate between a break >> label and a break value is to let the compiler figure this out, >> i wonder if it's not simpler to disallow break label (and continue label) >> inside an expression switch. >> >> After all, an expression switch do not exist yet, so no backward >> compatibility issue, it may make some refactoring impossible but had the >> great advantage to do not allow a lot of puzzler codes like the one below. >> >> enum Result { >> ONE, MANY >> } >> >> Result result(String[] args) { >> ONE: for(String s: args) { >> return switch(s) { >> case "several": >> case "many": >> break MANY; >> case "one": >> break ONE; >> default: >> continue; >> }; >> } >> throw ...; >> } >> >> R?mi >> > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Fri Mar 2 19:45:52 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 2 Mar 2018 11:45:52 -0800 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <48dc583f-73f1-2509-34d1-52af30e2cc5d@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <48dc583f-73f1-2509-34d1-52af30e2cc5d@oracle.com> Message-ID: <552686D5-3197-4717-A1BA-B2182B89908E@oracle.com> On Mar 2, 2018, at 8:12 AM, Brian Goetz wrote: > > Do we disallow the "break TOP" and return in the inner switch? > IOW, does the expression form a barrier through which control > can only pass via break or exceptions? IIRC last time we talked about this that was the consensus. It's consistent with what Kevin just wrote also, about constraining the result of a switch-expr. We currently don't have any expressions which can branch, just return normally or throw. Let's keep it that way. (Remember we had an option for lambdas to branch outside the lambda expression, and we decisively shut it down for various reasons.) We *do* have expressions which can *internally branch* without completing. Those are lambdas. Switch expressions should also be able to have arbitrary control flow inside (if the coder decides). But that "barrier" idea expresses the key constraint: That the branch labels used inside the switch are operationally disjoint from those outside. (Separately, as a syntax constraint, labels enclosing the same bit of syntax should be distinct. But that doesn't mean an outer label is ever reachable from inner control flow. Trying to reach across the barrier must be an error.) If someone wants to write code like Remi's puzzlers, fine, but they have to refactor to a statement-switch, and assign the result to a temp, like today. Statements can branch to locations other than their standard completion point. ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Fri Mar 2 20:06:35 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Fri, 2 Mar 2018 21:06:35 +0100 (CET) Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <48dc583f-73f1-2509-34d1-52af30e2cc5d@oracle.com> Message-ID: <1267188026.1909904.1520021195387.JavaMail.zimbra@u-pem.fr> Hi Kevin, i've already proposed to remove 'case' (using match instead of ??) 
see http://mail.openjdk.java.net/pipermail/amber-spec-experts/2017-December/000211.html and http://mail.openjdk.java.net/pipermail/amber-spec-experts/2017-December/000232.html but Brian did not like it :) R?mi > De: "Kevin Bourrillion" > ?: "Brian Goetz" > Cc: "Remi Forax" , "amber-spec-experts" > > Envoy?: Vendredi 2 Mars 2018 17:40:05 > Objet: Re: Disallowing break label (and continue label) inside an expression > switch > I would very much favor constraining what can be done inside an expression > switch as sharply as we can. > In fact, if we were designing the language from scratch, all at the same time, > but having already decided the behavior of the conditional operators ?: ... > might we not logically design this as similarly as possible to that? > // compare this two-way choice > return status == ABC ? something() : somethingElse(); > // to this three-way choice > return status ?? { ABC -> something(); DEF, XYZ -> somethingElse(); default -> > somethingOther(); } > (stand-in syntax only) > You could ideally think of ?: as sugar for the latter. Might we possibly be > better off thinking of it this way, and not even worrying about multi-statement > cases (you can always extract a method)? It seems like several of our woes > disappear. > Something to consider is whether we want a construct that readability-conscious > developers will use only as the object of a return statement or variable > assignment (which is what I think we have, currently), or one that they might > comfortably use in-line in other circumstances, like > Result r = methodCall( > param1, > param2, > status ?? { > ABC -> something(); > DEF, XYZ -> somethingElse(); > default -> somethingOther(); > }); > I can see that usage at least being debatable, while I suspect we might frown on > it all spelled out with `switch` and `case`....? > As a last point in its favor, `null` can be treated completely normally - if you > didn't list `null ->` or `ABC, null ->`, etc., then use the default. > On Fri, Mar 2, 2018 at 8:12 AM, Brian Goetz < [ mailto:brian.goetz at oracle.com | > brian.goetz at oracle.com ] > wrote: >> Thanks for bringing this up. I remember it being discussed once before, but I >> don't think we acted on it. >> I agree that expression switch is an expression, and it should either yield a >> value or throw something; breaking out of the middle of an expression is not >> something we have, nor does it seem necessary. (Though I'm sure clever folks >> could come up with a good example where it would be convenient.) >> A sensible extension of this is no "return" from a switch expression either: >> int foo(int x) { >> return switch (x) { >> case 1 -> 2; >> case 2 -> 4; >> case 3 -> 8; >> default: return Integer.MAX_VALUE; >> } >> } >> Like conditionals, then, switch expressions would either yield a value (through >> breaking) or throw. This seems consistent, but...what happens when we nest a >> statement in a switch expression? >> void foo(int x, int y, int z) { >> TOP: >> switch (z) { >> case 1: >> int i = switch (x) { >> case 1 -> 2; >> case 2: >> switch (y) { >> case 1: return; >> default: break TOP; >> } >> } >> } >> } >> Do we disallow the "break TOP" and return in the inner switch? IOW, does the >> expression form a barrier through which control can only pass via break or >> exceptions? 
>> On 3/2/2018 9:30 AM, Remi Forax wrote: >>> Hi all, >>> as far as i remember, the current idea to differentiate between a break label >>> and a break value is to let the compiler figure this out, >>> i wonder if it's not simpler to disallow break label (and continue label) inside >>> an expression switch. >>> After all, an expression switch do not exist yet, so no backward compatibility >>> issue, it may make some refactoring impossible but had the great advantage to >>> do not allow a lot of puzzler codes like the one below. >>> enum Result { >>> ONE, MANY >>> } >>> Result result(String[] args) { >>> ONE: for(String s: args) { >>> return switch(s) { >>> case "several": >>> case "many": >>> break MANY; >>> case "one": >>> break ONE; >>> default: >>> continue; >>> }; >>> } >>> throw ...; >>> } >>> R?mi > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | [ mailto:kevinb at google.com | > kevinb at google.com ] -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Fri Mar 2 20:10:13 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 2 Mar 2018 21:10:13 +0100 (CET) Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <552686D5-3197-4717-A1BA-B2182B89908E@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <48dc583f-73f1-2509-34d1-52af30e2cc5d@oracle.com> <552686D5-3197-4717-A1BA-B2182B89908E@oracle.com> Message-ID: <17957173.1910101.1520021413879.JavaMail.zimbra@u-pem.fr> > Envoy?: Vendredi 2 Mars 2018 20:45:52 > Objet: Re: Disallowing break label (and continue label) inside an expression > switch > On Mar 2, 2018, at 8:12 AM, Brian Goetz < [ mailto:brian.goetz at oracle.com | > brian.goetz at oracle.com ] > wrote: >> Do we disallow the "break TOP" and return in the inner switch? >> IOW, does the expression form a barrier through which control >> can only pass via break or exceptions? > IIRC last time we talked about this that was the consensus. > It's consistent with what Kevin just wrote also, about constraining > the result of a switch-expr. > We currently don't have any expressions which can branch, > just return normally or throw. Let's keep it that way. I'm glad we all agree here. > (Remember we had an option for lambdas to branch > outside the lambda expression, and we decisively shut > it down for various reasons.) > We *do* have expressions which can *internally branch* > without completing. Those are lambdas. Switch expressions > should also be able to have arbitrary control flow inside > (if the coder decides). But that "barrier" idea expresses > the key constraint: That the branch labels used inside > the switch are operationally disjoint from those outside. > (Separately, as a syntax constraint, labels enclosing > the same bit of syntax should be distinct. But that > doesn't mean an outer label is ever reachable from > inner control flow. Trying to reach across the barrier > must be an error.) > If someone wants to write code like Remi's puzzlers, > fine, but they have to refactor to a statement-switch, > and assign the result to a temp, like today. Statements > can branch to locations other than their standard > completion point. yes ! > ? John R?mi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.goetz at oracle.com Thu Mar 8 19:36:12 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 8 Mar 2018 14:36:12 -0500 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> Message-ID: Jan has updated the prototype to make the switch expression a bubble penetrable only by exceptions. Please take a look! On 3/2/2018 9:30 AM, Remi Forax wrote: > Hi all, > as far as i remember, the current idea to differentiate between a break label and a break value is to let the compiler figure this out, > i wonder if it's not simpler to disallow break label (and continue label) inside an expression switch. > > After all, an expression switch do not exist yet, so no backward compatibility issue, it may make some refactoring impossible but had the great advantage to do not allow a lot of puzzler codes like the one below. > > enum Result { > ONE, MANY > } > > Result result(String[] args) { > ONE: for(String s: args) { > return switch(s) { > case "several": > case "many": > break MANY; > case "one": > break ONE; > default: > continue; > }; > } > throw ...; > } > > Rémi From forax at univ-mlv.fr Fri Mar 9 21:19:18 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 9 Mar 2018 22:19:18 +0100 (CET) Subject: break seen as a C archaism Message-ID: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Hi guys, i've talked to and given presentations to several people online and offline about the enhanced switch. While the overall feedback is positive (or very positive), the usage of break as a local return from a case expression is seen as a bad decision. For some people, it's just ugly, but they can live with that. For others, it elevates the status of break and break is seen as something wrong, an archaism from C. For one, we are "retarded" (it was in French but i think it's the right translation) because even C# does not use break :) When i asked what we should do instead, the answer is either: 1/ we should not allow block of codes in the expression switch but only expression 2/ that we should use the lambda syntax with return, even if the semantics is different from the lambda semantics. I do not like (1) because i think the expression switch will become useless, but for (2) i think i was wrong to consider the semantics difference as something important. So should we backup and re-use the lambda semantics instead of enhancing break ? Rémi From kevinb at google.com Fri Mar 9 21:56:20 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 9 Mar 2018 13:56:20 -0800 Subject: Preconditions (for records, or otherwise) Message-ID: Hello, I'd claim it's an uncontroversial best practice that method and constructor parameters should be aggressively *checked for validity*, especially when that data is stored and used later (when knowledge of where that bad value came from has vanished). One thing I've been pushing for as a result is that the design of records really, really should not impose a disproportionate penalty to adding the first bit of validation. If to check that a number is positive I have to change record Foo(int num, String unrelated) {} to record Foo(int num, String unrelated) { Foo(int num, String unrelated) { if (num <= 0) ...; default.this(num, unrelated); } } ... then I'd say that the cost of listing my fields three times instead of once is too great, and the user may not bother.
For records, the right amount of repetition really is no repetition. We've discussed addressing this in either a *records-specific* or a generalized way. The former is (imho) the least we can do to satisfy "first do no harm". This could be a matter of saying that a record's primary constructor gets to have various uninteresting boilerplate be *inferred* (though if we had a way to also get around the traditional annoyance of parameters and fields both being in scope at the same time with the same names, that might be even better). So I'd like to figure out what that would look like. (As a side product, maybe this solution solves the question of "where does a constructor annotation go?".) But I don't want to give up too easily on a more *general* approach that would apply to records, methods, and constructors. That's been sketched at times as void foo(int num, String unrelated) requires (num >= 0) { ... } where `requires` takes a boolean expression*, which lives in the same scope as the body proper; if it evaluates to false, an exception is thrown and the body is never entered. The main criticism I hear about this is that it feels like a *"method with two bodies"*. To that I'd point out that - it is only an *expression* -- and anything even moderately complex ought to be factored out, just like we advise for lambdas - this expression isn't implementation; it's contract, so frankly it *belongs *in this elevated place more than it does in the body. It is information that pertains, not really to the body, but to the communication between caller and body - just like the signature does. - this way, the preconditions can be *inherited* by default in an overriding method, which seems awfully convenient to me right now. (If you have some conditions you wouldn't want inherited for some reason, keep those in the regular body. I'm not sure whether these are *technically* LSP violations, but in pragmatic terms they don't seem to be, to me) I bring all this up because some of the upsides seem quite compelling to me: - The automatically composed exception *message* will be more useful than what 90% of users bother to string together (and the other 10% are wasting time and space dealing with it). - These expressions can be displayed in generated *documentation* so you don't have to write them out a second time in prose. - I admit this may feel weird for a core language feature, but you can choose the idiomatic exception *type* automatically: if the expression involved at least one parameter, it's IAE; otherwise it's probably ISE (except in the amusing case of `requires (false)` it is UOE). (Again, maybe this is too weird.) - Some of these expressions are *verifiable* statically. For example a call to `foo(-1, "x")` (using example above) should be caught by javac. I suppose we teach it to recognize cases like empty collections through compiler plugins. Note that the other design-by-contract idioms are still addressed well enough by `assert`; we only need this one because `assert` disclaims this use case (for good reason). Lastly... hey, what about just a *library* like Guava's Preconditions class? I made that thing, and it is extremely popular here. It also gives extraordinarily small benefit. Yeah, it lets you express your expectation positively instead of negatively. It lets you create a message with %s. That's about it. Yawn. Thoughts? 
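For concreteness, the library idiom being weighed here looks roughly like this today -- a sketch only; Preconditions.checkArgument and Objects.requireNonNull are real APIs, but the Foo class and its messages are just made up for illustration:

    import static com.google.common.base.Preconditions.checkArgument;
    import java.util.Objects;

    final class Foo {
        private final int num;
        private final String unrelated;

        Foo(int num, String unrelated) {
            // throws IllegalArgumentException, formatting the %s message, if the check fails
            checkArgument(num > 0, "num must be positive: %s", num);
            // throws NullPointerException if unrelated is null
            this.num = num;
            this.unrelated = Objects.requireNonNull(unrelated, "unrelated");
        }
    }

It does the job, but it is still imperative code inside the body; nothing about it shows up in the signature, in generated documentation, or in an overriding method.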
(*why I say it should take one boolean expression, not a comma-separated list: I think we might as well let the user choose between short-circuiting or not, by using && and & directly, which makes it clear to readers as well. Well, that is, charitably assuming that reader remembers the difference.) -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 9 22:15:21 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 9 Mar 2018 17:15:21 -0500 Subject: break seen as a C archaism In-Reply-To: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: <26915b4d-6ba0-f7ae-b377-9f7867eb4fab@oracle.com> I understand where these people are coming from.? But my experience is, with a platform as mature as Java, one must be very, very careful of the desire to opportunistically "fix" "mistakes" of the past, it can be a siren song that draws you to the rocks.? I am skeptical of arguments that "we should kill break (or at least, not make it more important), because it's old stuff and we're young and hip", even though I have a certain sympathy for this argument.? (Well, I'm old, and was never hip, but I'd like to try it someday.) Fallthrough is certainly one of the biggest punching bags of Java.? However, the problem with fallthrough is not fallthrough itself, but the fact that fallthrough is the default, when 98% of the time you do not want fallthrough.? That's a separate problem -- and might admit a separate solution. In switch expressions, 98% of the time, except maybe in the default arm, no one will ever have to type break, because you can usually say: ??? int x = switch (y) { ??????? case 1 -> 2; ??????? case 2 -> 4; ??????? case 3 -> 8; ??????? default: ??????????? throw new TooBigException(); ??? } See, no break.? But sometimes, you will need it. There are basically two stable ways we can go here: ?- Renovate switch as best we can to support expressions and patterns. ?- Leave switch in the legacy bin, and make a new construct, say "match".? (Note that doing this helps with the fallthrough-by-default, but doesn't really help switch expressions at all -- we still have to solve the same problem.) There are costs to both, of course.? (Engineers tend to over-rotate towards the second because it seems more fun and modern, but sticking with what works, and what Java developers _already understand_, is often better.)? Our current strategy is to stick with what works until that approach is proven unworkable. I think trying to "tame" switch is less stable; if we're going to stick with switch, we should avoid unnecessary discontinuities between make statement and expression switch. > For others, it elevates the status of break and break is seen as something wrong, an archaism from C. I think this is really another form of the emotional "its ugly" reaction. > When i asked what we should do instead, the answer is either: > 1/ we should not allow block of codes in the expression switch but only expression This option is not only dislikable (as you suggest), but naive. While most of the time, you can say what you want in one expression, there will be times where you'll want to do things like the following: ??? String y = switch (x) { ??????? case 1: ??????????? System.out.println("DEBUG: found where the 1 is coming from!"); ??????????? break "one"; ?????? case 2: ?????????? 
if (throwOnTwo) throw new TwoException(); else break "two"; case 3: StringMaker sm = new StringMaker(); sm.setSeed(System.currentTimeMillis()); break sm.randomString(); } While all of these are likely to be infrequent, telling people "just refactor your switch to a chain of if-then-else if you want to do that" is going to go over like the proverbial lead balloon. So, no to that one. > 2/ that we should use the lambda syntax with return, even if the semantics is different from the lambda semantics. Yes, we considered this, and actually thought this was a clever idea for a short while. (General lesson: beware of clever-seeming ideas.) But, among other problems, reusing "return" in this manner is even more of an abuse than reusing "break". It's a credible choice, but it has its own problems too. > So should we backup and re-use the lambda semantics instead of enhancing break ? Personally, I think if we're going to stick with switch, the current proposal -- which generalizes the existing switch semantics -- is staying more true to what switch is. If we find we have to abandon switch and do something else, then I think many more options are on the table. BTW, I think most people misunderstand the current proposal because it's usually explained backwards. So let me explain it forwards, and I think it makes more sense this way. STEP 1: Extend switch so it can be an expression. Extend break so it can take a value to be yielded from the switch expression. This means that everything about statement switches and expression switches are the same, except that a statement switch can terminate nonlocally and an expression switch can't. But it is a very straightforward extension of the control flow of switch to expressions, and that's a plus. What's ugly about it is that you have to say break a lot: int x = switch (y) { case 1: break 2; case 2: break 4; case 3: break 8; default: throw new TooBigException(); } Which brings us to step two, which is purely a syntactic optimization for the very common case where an expression switch arm has no statements other than break: STEP 2: In a switch expression, allow: case label -> e as shorthand for case label: break e (much as an expression lambda is a shorthand for a statement lambda.) If you explain it the other way, people think that -> is what makes the switch an expression switch, and then the break rule seems weirder. From kevinb at google.com Fri Mar 9 22:49:21 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 9 Mar 2018 14:49:21 -0800 Subject: break seen as a C archaism In-Reply-To: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax wrote: When i asked what we should do instead, the answer is either: > 1/ we should not allow block of codes in the expression switch but only > expression > 2/ that we should use the lambda syntax with return, even if the > semantics is different from the lambda semantics. > > I do not like (1) because i think the expression switch will become useless In our (large) codebase, +Louis determined that, among switch statements that appear translatable to expression switch, 13.8% of them seem to require at least one multi-statement case. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From brian.goetz at oracle.com Fri Mar 9 22:55:53 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 9 Mar 2018 17:55:53 -0500 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: Did you happen to calculate what percentage was _not_ the "default" case?? I would expect that to be a considerable fraction. On 3/9/2018 5:49 PM, Kevin Bourrillion wrote: > On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax > wrote: > > When i asked what we should do instead, the answer is either: > ? 1/ we should not allow block of codes in the expression switch > but only expression > ? 2/ that we should use the lambda syntax with return, even if the > semantics is different from the lambda semantics. > > I do not like (1) because i think the expression switch will > become useless > > > In our (large) codebase, +Louis determined that, among switch > statements that appear translatable to expression switch, 13.8% of > them seem to require at least one multi-statement case. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Fri Mar 9 23:24:06 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 10 Mar 2018 00:24:06 +0100 (CET) Subject: break seen as a C archaism In-Reply-To: <26915b4d-6ba0-f7ae-b377-9f7867eb4fab@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <26915b4d-6ba0-f7ae-b377-9f7867eb4fab@oracle.com> Message-ID: <1310028132.2229357.1520637846539.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" , "amber-spec-experts" > Envoy?: Vendredi 9 Mars 2018 23:15:21 > Objet: Re: break seen as a C archaism > I understand where these people are coming from.? But my experience is, > with a platform as mature as Java, one must be very, very careful of the > desire to opportunistically "fix" "mistakes" of the past, it can be a > siren song that draws you to the rocks.? I am skeptical of arguments > that "we should kill break (or at least, not make it more important), > because it's old stuff and we're young and hip", even though I have a > certain sympathy for this argument.? (Well, I'm old, and was never hip, > but I'd like to try it someday.) > > Fallthrough is certainly one of the biggest punching bags of Java. > However, the problem with fallthrough is not fallthrough itself, but the > fact that fallthrough is the default, when 98% of the time you do not > want fallthrough.? That's a separate problem -- and might admit a > separate solution. > > In switch expressions, 98% of the time, except maybe in the default arm, > no one will ever have to type break, because you can usually say: > > ??? int x = switch (y) { > ??????? case 1 -> 2; > ??????? case 2 -> 4; > ??????? case 3 -> 8; > ??????? default: > ??????????? throw new TooBigException(); > ??? } > > See, no break.? But sometimes, you will need it. > > There are basically two stable ways we can go here: > ?- Renovate switch as best we can to support expressions and patterns. > ?- Leave switch in the legacy bin, and make a new construct, say > "match".? (Note that doing this helps with the fallthrough-by-default, > but doesn't really help switch expressions at all -- we still have to > solve the same problem.) > > There are costs to both, of course.? 
(Engineers tend to over-rotate > towards the second because it seems more fun and modern, but sticking > with what works, and what Java developers _already understand_, is often > better.)? Our current strategy is to stick with what works until that > approach is proven unworkable. > > I think trying to "tame" switch is less stable; if we're going to stick > with switch, we should avoid unnecessary discontinuities between make > statement and expression switch. > >> For others, it elevates the status of break and break is seen as something >> wrong, an archaism from C. > > I think this is really another form of the emotional "its ugly" reaction. > >> When i asked what we should do instead, the answer is either: >> 1/ we should not allow block of codes in the expression switch but only >> expression > > This option is not only dislikable (as you suggest), but naive. While > most of the time, you can say what you want in one expression, there > will be times where you'll want to do things like the following: > > ??? String y = switch (x) { > ??????? case 1: > ??????????? System.out.println("DEBUG: found where the 1 is coming from!"); > ??????????? break "one"; > > ?????? case 2: > ?????????? if (throwOnTwo) > ?????????????? throw new TwoException(); > ?????????? else > ?????????????? break "two"; > > ?????? case 3: > ??????????? StringMaker sm = new StringMaker(); > ??????????? sm.setSeed(System.currentTimeMillis()); > ??????????? break sm.randomString(); > ??? } > > While all of these are likely to be infrequent, telling people "just > refactor your switch to a chain of if-then-else if you want to do that" > is going to go over like the proverbial lead balloon. > > So, no to that one. > >> 2/ that we should use the lambda syntax with return, even if the semantics is >> different from the lambda semantics. > > Yes, we considered this, and actually thought this was a clever idea for > a short while.? (General lesson: beware of clever-seeming ideas.)? But, > among other problems, reusing "return" in this manner is even more of an > abuse than reusing "break".? It's a credible choice, but it has its own > problems too. > >> So should we backup and re-use the lambda semantics instead of enhancing break ? > > Pesonally, I think if we're going to stick with switch, the current > proposal -- which generalizes the existing switch semantics -- is > staying more true to what switch is.? If we find we have to abandon > switch and do something else, then I think many more options are on the > table. > > BTW, I think most people misunderstand the current proposal because its > usually explained backwards.? So let me explain it forwards, and I think > it makes more sense this way. > > STEP 1: Extend switch so it can be an expression.? Extend break so it > can take a value to be yielded from the switch expression. > > This means that everything about statement switches and expression > switches are the same, except that a statement switch can terminate > nonlocally and an expression switch can't.? But it is a very > straightforward extension of the control flow of switch to expressions, > and that's a plus.? What's ugly about it is that you have to say break a > lot: > > ??? int x = switch (y) { > ??????? case 1: break 2; > ??????? case 2: break 4; > ??????? case 3: break 8; > ??????? default: throw new TooBigException(); > ?? 
} > > Which brings us to step two, which is purely a syntactic optimization > for the very common case where an expression switch arm has no > statements other than break: > > STEP 2: In a switch expression, allow: > > ??? case label -> e > > as shorthand for > > ??? case label: break e > > (much as an expression lambda is a shorthand for a statement lambda.) > > If you explain it the other way, people think that -> is what makes the > switch an expression switch, and then the break rule seems weirder. I think part of the confusion comes from the fact that we reuse '->' which is strongly associated to lambdas, perhaps a way to avoid that is to not reuse the arrow. When you say expression lambda is a shorthand for statement lambda, nevertheless, both form use the arrow, again, what if we do not use an arrow for the shorthand case ? Technically, i do not think we need a symbol at all, but it will be ugly because we also want to have multiple values for the case separated by comma. So let say we need a symbol but not '->', perhaps ':>' may work ? int x = switch (y) { case 1 :> 2; case 2 :> 4; case 3 :> 8; default: throw new TooBigException(); }; R?mi From brian.goetz at oracle.com Fri Mar 9 23:24:37 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 9 Mar 2018 18:24:37 -0500 Subject: Preconditions (for records, or otherwise) In-Reply-To: References: Message-ID: > Hello, > > I'd claim it's an uncontroversial best practice that method and > constructor parameters should be aggressively *checked for validity*, > especially when that data is stored and used later (when knowledge of > where that bad value came from has vanished). Yes, and especially so for immutable objects (which we'd like to nudge towards somehow, such as making `final` the default for record fields and requiring an explicit `non-final`.) I think what you're getting at is that failing to validate in records will become an attractive nuisance; because we've stripped away all the boilerplate, reconstructing a place to hang validation code will be on the wrong side of the activation energy leap, and anything we can do to reduce the cost of responsible validation is worth doing.? I tend to agree; if people perceive validation as inconvenient, they won't do it, and our data will be of lower quality.? Which is not the goal ;) > One thing I've been pushing for as a result is that the design of > records really, really should not impose a disproportionate penalty to > adding the first bit of validation. If to check that a number is > positive I have to change > > ?record Foo(int num, String unrelated) {} > > to > > ?record Foo(int num, String unrelated) { > ? ?Foo(int num, String unrelated) { > ? ? ?if (num <= 0) ...; > ? ? ?default.this(num, unrelated); > ? ?} > ?} > > ... then I'd say that the cost of listing my fields three times > instead of once is too great, and the user may not bother. For > records, the right amount of repetition really is no repetition. In general, I am wary about arguments that sound like catering to Billy Boilerplate, but in this case I agree -- validation is so important, we don't want to give people any excuses to not do it. First, let's catalog the repetition. I think we can surely eliminate the `default.this(num, unrelated)` call; the parser production can tell the difference between a constructor body that contains an explicit constructor call (super/this/default.this) and one that does not.? 
So if the user provides a constructor without an explicit constructor call, we can just fill in the default initialization.? (There are arguments in favor of both putting it at the beginning and at the end; I'll take that up separately.)? That gets us down to: ?record Foo(int num, String unrelated) { ? ?Foo(int num, String unrelated) { ? ? ?if (num <= 0) ...; ?? } ?} Which is slightly better.? The remaining repetition is the argument list of the constructor is repeated.? On the one hand, this seems at first to be incompressible: Java members have signatures.? But if we're willing to do something special for records, we can take advantage of the fact that a record _must_ have a constructor (the "primary constructor") that matches the record signature, and let the user leave this out if they are declaring an explicit primary constructor: ?record Foo(int num, String unrelated) { ? ?Foo{ ? ? ?if (num <= 0) ...; ?? } ?} or ?record Foo(int num, String unrelated) { ?? primary-constructor { ? ? ?if (num <= 0) ...; ?? } ?} or similar could be a shorthand for declaring the full primary constructor signature.? I think that gets you down to your minimum, without losing much in the way of readability. > We've discussed addressing this in either a *records-specific* or a > generalized way. The former is (imho) the least we can do to satisfy > "first do no harm". This could be a matter of saying that a record's > primary constructor gets to have various uninteresting boilerplate be > *inferred* (though if we had a way to also get around the traditional > annoyance of parameters and fields both being in scope at the same > time with the same names, that might be even better). So I'd like to > figure out what that would look like. (As a side product, maybe this > solution solves the question of "where does a constructor annotation > go?".) The above is a record-specific approach, which takes advantage of the fact that a record must have a constructor that matches the record signature. I am a little more wary of dealing with the "pun" between the fields and the constructor arguments in a record-specific way, since I'd like for there to be a simple mechanical refactoring between records and classes that could be records. If the compiler is going to fill in the super call / field initialization, we have two choices: fill them in at the beginning, or at the end, of the constructor.? Both have pros and cons. General: we can lift the restriction of "no statements before this/super" (for years, we couldn't, until some verifier improvements made this possible).? In that case, statements before the constructor call have `this` as DU, so you can't do field access in that part of the ctor, you can only work on the arguments: ??? Foo(int x, int y) { ??????? if (x < y) throw ...?? // `this` is DU ??????? this(...);?????????????????? // now `this` is DA ??????? more stuff ??? } If we are presented with ? ?Foo(int num, String unrelated) { ? ? ?if (num <= 0) ...; ?? } we could desugar that into ? ?Foo(int num, String unrelated) { ? ? ?if (num <= 0) ...; ? ? ?default.this(num, unrelated); ? ?} or ? ?Foo(int num, String unrelated) { ? ? ?default.this(num, unrelated); ? ?? if (num <= 0) ...; ?? } Checking preconditions after you create the object feels wrong to me, so the former speaks to me.? It also has the benefit that you can normalize the fields without having to explicitly provide the default call: ?? Foo(int num, String unrelated) { ? ? ?if (num <= 0) num = 0; ? ?? 
// implicit default, writes normalized num to this.num ? ?} The only time you'd need an explicit default under this interpretation is if you had additional fields to initialize in the constructor. Putting the implicit default first is more like how things work now, but it means you can only check the values after you write the fields, and you can't normalize them without an explicit default. > Lastly... hey, what about just a *library*?like Guava's Preconditions > class? I made that thing, and it is extremely popular here. It also > gives extraordinarily small benefit. Yeah, it lets you express your > expectation positively instead of negatively. It lets you create a > message with %s. That's about it. Yawn. I'd be in favor; seems a pretty big return-on-complexity.? The closest thing we have in JDK now is Objects.requireNonNull(), which I use a lot but is obviously a one-trick pony. Other ideas along these lines include: ?- A statement, analogous to assert, but which is unconditional ("validate"); ?- Compiler and documentation heuristics for Preconditions, where, if a method begins with a block of preconditions, they are propagated into the documentation; ?- Better Javadoc support for describing preconditions, so they don't have to be spelled out longhand. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lowasser at google.com Fri Mar 9 23:21:19 2018 From: lowasser at google.com (Louis Wasserman) Date: Fri, 09 Mar 2018 23:21:19 +0000 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: Simplifying: let's call normal cases in a switch simple if they're a single statement or a no-op fallthrough, and let's call a default simple if it's a single statement or it's not there at all. Among switches apparently convertible to expression switches, - 81% had all simple normal cases and a simple default. - 5% had all simple normal cases and a nonsimple default. - 12% had a nonsimple normal case and a simple default. - 2% had a nonsimple normal case and a nonsimple default. I think Kevin was looking at a table that didn't take the default into account. On Fri, Mar 9, 2018 at 2:56 PM Brian Goetz wrote: > Did you happen to calculate what percentage was _not_ the "default" case? > I would expect that to be a considerable fraction. > > On 3/9/2018 5:49 PM, Kevin Bourrillion wrote: > > On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax wrote: > > When i asked what we should do instead, the answer is either: >> 1/ we should not allow block of codes in the expression switch but only >> expression >> 2/ that we should use the lambda syntax with return, even if the >> semantics is different from the lambda semantics. >> >> I do not like (1) because i think the expression switch will become >> useless > > > In our (large) codebase, +Louis determined that, among switch statements > that appear translatable to expression switch, 13.8% of them seem to > require at least one multi-statement case. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Sat Mar 10 00:46:52 2018 From: guy.steele at oracle.com (Guy Steele) Date: Fri, 9 Mar 2018 19:46:52 -0500 Subject: Preconditions (for records, or otherwise) In-Reply-To: References: Message-ID: <12B5E879-92D2-4697-A7B2-05DD7E5E093A@oracle.com> > On Mar 9, 2018, at 6:24 PM, Brian Goetz wrote: > > . . . 
That gets us down to: > > record Foo(int num, String unrelated) { > Foo(int num, String unrelated) { > if (num <= 0) ...; > } > } > > Which is slightly better. The remaining repetition is the argument list of the constructor is repeated. On the one hand, this seems at first to be incompressible: Java members have signatures. But if we're willing to do something special for records, we can take advantage of the fact that a record _must_ have a constructor (the "primary constructor") that matches the record signature, and let the user leave this out if they are declaring an explicit primary constructor: > > record Foo(int num, String unrelated) { > Foo { > if (num <= 0) ...; > } > } > > or > > record Foo(int num, String unrelated) { > primary-constructor { > if (num <= 0) ...; > } > } > > or similar could be a shorthand for declaring the full primary constructor signature. I think that gets you down to your minimum, without losing much in the way of readability. Ya know, as long as we are talking about a mandatory constructor, if that special keyword "primary-constructor" were spelled "required", it would look achingly familiar: record Foo(int num, String unrelated) { required { if (num <= 0) ...; } } Compare to: record Foo(int num, String unrelated) requires (num <= 0) { } -- Guy -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Sat Mar 10 02:21:01 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 9 Mar 2018 18:21:01 -0800 Subject: break seen as a C archaism In-Reply-To: <26915b4d-6ba0-f7ae-b377-9f7867eb4fab@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <26915b4d-6ba0-f7ae-b377-9f7867eb4fab@oracle.com> Message-ID: <40E2E0CB-2560-4468-84A4-82F9DB9FDA4D@oracle.com> You said it all. +100 On Mar 9, 2018, at 2:15 PM, Brian Goetz wrote: > > I understand where these people are coming from. But my experience is, with a platform as mature as Java, one must be very, very careful of the desire to opportunistically "fix" "mistakes" of the past, it can be a siren song that draws you to the rocks. I am skeptical... > ... > Personally, I think if we're going to stick with switch, the current proposal -- which generalizes the existing switch semantics -- is staying more true to what switch is. If we find we have to abandon switch and do something else, then I think many more options are on the table. I'll add that switch has accommodated all of our upgrades surprisingly well, to the point that I am very hopeful we don't have to pay the very costs of a totally new construct. The major cost, IMO, would be confusing the language by providing two complex but incompatible constructs with strongly overlapping functionality. I'd very much rather add new constructs because the language can't really express something type-safely, or because there is a 30x gain in compactness for some common construct. Then the overlap issue doesn't appear. -- John From brian.goetz at oracle.com Mon Mar 12 17:48:35 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 12 Mar 2018 13:48:35 -0400 Subject: Records: construction and validation Message-ID: <9ba9752c-d595-0021-6205-cff020443558@oracle.com> Here's a sketch of where our thinking is right now for construction and validation. General goal: As Kevin pointed out, we should make adding incremental validation easy, otherwise people won't do it, and the result is worse code.
It should be simple to add validation (and possibly also normalization) logic to constructors without falling off the syntactic cliff, either in the declaration or the body of the constructor. All records have a /default constructor/.? This is the one whose signature matches the class signature.? If you don't have an explicit one, you get an implicit one, regardless of whether or not there are other constructors. If you have records: ??? abstract record A(int a) { } ??? record B(int a, int b) extends A(a) { } then the behavior of the default constructor for B is: ??? super(a); ??? this.b = b; If you want to provide an explicit constructor to ensure, for example, that b > 0, you could just say it yourself: ??? public B(int a, int b) { ??????? if (b <= 0) ??????????? throw new IllegalArgumentException("b"); ??????? super(a); ??????? this.b = b; ??? } Wait, wait a second...? I thought we couldn't put statements ahead of the super-call? DIGRESSION... Historically, this() or super() must be first in a constructor. This restriction was never popular, and perceived as arbitrary. There were a number of subtle reasons, including the verification of invokespecial, that contributed to this restriction.? Over the years, we've addressed these at the VM level, to the point where it becomes practical to consider lifting this restriction, not just for records, but for all constructors. Currently a constructor follows a production like: ??? [ explicit-ctor-invocation ] statement* We can extend this to be: ??? statement* [ explicit-ctor-invocation statement* ] and treat `this` as DU in the statements in the first block. ...END DIGRESSION OK, so we can put a statement ahead of the super-call.? But this explicit declaration is awfully verbose.? We can trim this by: ?- Allow the compiler to infer the signature for the default constructor, if none is provided; ?- Provide a shorthand for "just do the default initialization". Now we get: ??? public B { ??????? if (b <= 0) ??????????? throw new IllegalArgumentException("b"); ??????? default.this(a, b); ??? } There's still some repetition here; it would be nice if the default initialization were inferred as well.? Which leads to a question: if we have a record constructor with no explicit constructor call, do we do the default initialization at the beginning or the end?? In other words, does this: ??? public B { ??????? if (b <= 0) ??????????? throw new IllegalArgumentException("b"); ??? } mean ??? public B { ??????? if (b <= 0) ??????????? throw new IllegalArgumentException("b"); ??????? default.this(a, b); ??? } or this: public B { default.this(a, b); if (b <= 0) ??????????? throw new IllegalArgumentException("b"); ??? } The two are subtly different, and the difference becomes apparent if we want to normalize arguments or make defensive copies, not just validate: public B { ??????? if (b <= 0) ??????????? b = 0; ??? } If we put our implicit construction at the beginning, this would be a dead assignment to the parameter, after the record was initialized, which is almost certainly not what the user meant.? If we put it at the end, this would pick up the update.? The former seems pretty error-prone, so the latter seems attractive. However, this runs into another issue, which is: what if we have additional fields?? (We might disallow this, but we might not.)? Now what if we wanted to do: ??? record B(int a, int b) { ??????? int cachedSum; ??????? B { ??????????? cachedSum = a + b; ??????? } ??? 
} If we treat the explicit statements as occuring before the default initialization, now `this` is DU at the point of assigning `cachedSum`, and the compiler tells us that we can't do this.? Of course, there's a workaround: B { default.this(a, b); cachedSum = a + b; ??????? } which might be good enough. (Note that we'd like to be able to extend this ability to constructors of classes other than records eventually, so we should work out the construction protocol in generality even if we're not going to do it all now.) Is `default.this(a, b)` still too verbose/general/error-prone?? Would some more invariant marker ("do the default thing now") be better, like: ??? B { ??????? new; this.cachedSum = a + b; ??? } So, summarizing: ?- We're OK with Foo { ... } as shorthand for the default constructor? ?- Where should the implicit construction go -- beginning or end? ?- Should there be a better idiom other than default.this(args) for "do the explicit construction now"? From kevinb at google.com Mon Mar 12 18:25:29 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Mon, 12 Mar 2018 11:25:29 -0700 Subject: Records: construction and validation In-Reply-To: <9ba9752c-d595-0021-6205-cff020443558@oracle.com> References: <9ba9752c-d595-0021-6205-cff020443558@oracle.com> Message-ID: On Mon, Mar 12, 2018 at 10:48 AM, Brian Goetz wrote: Historically, this() or super() must be first in a constructor. This > restriction was never popular, and perceived as arbitrary. > It is *very very occasionally* annoying. Do we have enough motivation to change it? Are they any circumstances today in which I have to worry that `this` might be unassigned, or would that be new? If I reviewed the code above, I would move the `super` to the top anyway. It's how we've always done it and there's nothing at all wrong with it, since it's an exception case. What are the more motivating examples? (OTOH, this is just a tangent to the main thread.) > If we put our implicit construction at the beginning, this would be a dead > assignment to the parameter, after the record was initialized, which is > almost certainly not what the user meant. If we put it at the end, this > would pick up the update. The former seems pretty error-prone, so the > latter seems attractive. > You've reminded me that this is how we make defensive copies, which I would call critically important for records, so yes. However, this runs into another issue, which is: what if we have additional > fields? (We might disallow this, but we might not.) Let's go to Crazy Town for a second... (and I mean it, this could be insane) Today, field initializers and instance initializers certainly don't have any constructor parameters in scope, because they apply to *all* constructors. But for records we've discussed mandating that all constructors must funnel through the primary one (which I think is good). That means there is really only one true constructor. Is it insane to say that initializers, then, only apply to that primary constructor, and ergo we allow that constructor's parameters to be referenced in initializers? Consequence 1: you could do public record B(int a, int b) { int cachedSum = a + b; } Consequence 2: maybe the precondition example doesn't have to be public record B(int a, int b) { public B { if (b < 0) throw... } } but simply public record B(int a, int b) { { if (b < 0) throw... 
} } (again, imho, "the right amount of repetition is no repetition") I like this outcome, but I would imagine that when we pull on the thread of everything else that follows from a decision like this we probably won't end up liking it... -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Mon Mar 12 19:10:44 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 12 Mar 2018 15:10:44 -0400 Subject: Records: construction and validation In-Reply-To: References: <9ba9752c-d595-0021-6205-cff020443558@oracle.com> Message-ID: <48dc47a1-409d-ec4f-ae0d-a72b44f97604@oracle.com> > > Let's go to Crazy Town for a second... (and I mean it, this could be > insane) > > Today, field initializers and instance initializers certainly don't > have any constructor parameters in scope, because they apply to > /all/?constructors. But for records we've discussed mandating that all > constructors must funnel through the primary one (which I think is > good). That means there is really only one true constructor. Is it > insane to say that initializers, then, only apply to that primary > constructor, and ergo we allow that constructor's parameters to be > referenced in initializers? It's not insane, but it does have cost.? Let me pull on that thread ... Right now, { ... } has a meaning in classes, which is that it runs before the constructor (with field initializers).? All things being equal, we'd like for constructs that are common to records and classes to mean the same thing in both; not only does this minimize confusion, but it also plays into a bigger goal for records, which is: records are "just" a macro for a specific class declaration. This goal minimizes the perceived complexity for users ("this thing is just like this other thing"), and also simplifies the story for refactoring back and forth. What you're really saying is to change the timing of instance initializers for records, to run _inside_ the default constructor, after the super-call / default field initializations.? This is one subtle difference from classes; the other is that the construction parameters are in scope (meaning that the meaning of `x` changes too.)? Arguably, you should do the same for field initializers. This seems a prety big change, to eliminate an utterance of "Foo". > (again, imho, "the right amount of repetition is no repetition") I'd say instead: every bit of repetition should carry its weight? -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Mon Mar 12 21:53:21 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 12 Mar 2018 17:53:21 -0400 Subject: Updated Constables and constant folding doc Message-ID: <7fb7f95b-6f0e-00c0-cba5-6fbae508a649@oracle.com> I've updated the document on the ConstantRef / Constable API, and added more detail on constant folding and propagation optimizations: ??? 
    http://cr.openjdk.java.net/~briangoetz/amber/constables.html

From brian.goetz at oracle.com  Mon Mar 12 22:03:21 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 12 Mar 2018 18:03:21 -0400
Subject: Records: construction and validation
In-Reply-To: <9ba9752c-d595-0021-6205-cff020443558@oracle.com>
References: <9ba9752c-d595-0021-6205-cff020443558@oracle.com>
Message-ID:

> Which leads to a question: if we have a record constructor with no explicit constructor call, do we do the default initialization at the beginning or the end?

Thinking some more, I think there's a better (but somewhat surprising) answer:

In all classes, the compiler inserts a super call if there is no explicit super/this, and it does so at the beginning.  This argues for putting the implicit initialization at the top, since we already do something similar elsewhere.  But ... this still gets in the way of normalizing fields of the current class.  So, let's refine this answer: if there is no explicit super/this, we put the implicit *super call* at the _top_ and the implicit *field initialization* _at the bottom_.  So given

    abstract record A(int a) { }
    record B(int a, int b) extends A(a) { }

then

    public B {
        if (b <= 0)
            throw new IllegalArgumentException("b");
    }

is really shorthand for

    public B {
        super(a);     // implicit
        if (b <= 0)
            throw new IllegalArgumentException("b");
        this.b = b;   // implicit
    }

This preserves consistency with implicit super in other cases, and also allows explicit code to normalize parameters without additional ceremony, as in:

    B {
        if (b < 0) b = 0;
    }

Here, the assignment to the parameter b will happen before the implicit assignment of `this.b = b`, so we get what we expect.  And in:

    B {
        cachedSum = a + b;
    }

the receiver is now DA at this point, so this is allowed too.  (If you happen to reference this.b, you'll be told it is DU, so you can't make a mistake there.)  So in almost no circumstances will you need an explicit default.this(...) call.

From forax at univ-mlv.fr  Mon Mar 12 22:31:45 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 12 Mar 2018 23:31:45 +0100 (CET)
Subject: Records: construction and validation
In-Reply-To: <9ba9752c-d595-0021-6205-cff020443558@oracle.com>
References: <9ba9752c-d595-0021-6205-cff020443558@oracle.com>
Message-ID: <1782502169.431728.1520893905796.JavaMail.zimbra@u-pem.fr>

Thinking only in terms of constructors is only half of the story of preconditions; you also need preconditions on setters (and getters, for the defensive copy). That's the advantage of moving the requirements into their own 'section': you can distribute the code to the different generated parts.

For example, if the requires clause says that x is positive:

    record B(int x) requires x >= 0;

it means that the constructor and the x() accessor throw an exception if the parameter x does not satisfy the condition.

Rémi

BTW, normalizing arguments is a nice way to lose the trust of your users: the behavior is obvious if you are the writer of the class, and it's a real pain if you are the poor user who has to figure out why, when he sends one value, the code acts as if it were another value. So normalization should always occur at the call site and never at the callee site. And it's the same issue with default values.
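Coming back to the requires idea above, here is a rough hand-written sketch of the kind of members I imagine being generated (illustrative only; the exception types and messages are not part of the proposal, and for an immutable int the accessor re-check is trivially true -- it matters more for mutable components like arrays):

    // hand-written equivalent of:  record B(int x) requires x >= 0;
    final class B {
        private final int x;

        B(int x) {
            if (!(x >= 0))   // the requires predicate, checked on construction
                throw new IllegalArgumentException("x: " + x);
            this.x = x;
        }

        int x() {
            if (!(x >= 0))   // the same predicate, re-applied by the generated accessor
                throw new IllegalStateException("x: " + x);
            return x;
        }
    }

The point is that one declared predicate is distributed to each generated part, so the author writes the invariant once.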
----- Mail original ----- > De: "Brian Goetz" > ?: "amber-spec-experts" > Envoy?: Lundi 12 Mars 2018 18:48:35 > Objet: Records: construction and validation > Here's a sketch of where our thinking is right now for construction and > validation. > > General goal: As Kevin pointed out, we should make adding incremental > validation easy, otherwise people won't do it, and the result is worse > code.? It should be simple to add validation (and possibly also > normalization) logic to constructors without falling off the syntactic > cliff, either in the declaration or the body of the constructor. > > All records have a /default constructor/.? This is the one whose > signature matches the class signature.? If you don't have an explicit > one, you get an implicit one, regardless of whether or not there are > other constructors. > > If you have records: > > ??? abstract record A(int a) { } > ??? record B(int a, int b) extends A(a) { } > > then the behavior of the default constructor for B is: > > ??? super(a); > ??? this.b = b; > > If you want to provide an explicit constructor to ensure, for example, > that b > 0, you could just say it yourself: > > ??? public B(int a, int b) { > ??????? if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??????? super(a); > ??????? this.b = b; > ??? } > > Wait, wait a second...? I thought we couldn't put statements ahead of > the super-call? > > DIGRESSION... > > Historically, this() or super() must be first in a constructor. This > restriction was never popular, and perceived as arbitrary. There were a > number of subtle reasons, including the verification of invokespecial, > that contributed to this restriction.? Over the years, we've addressed > these at the VM level, to the point where it becomes practical to > consider lifting this restriction, not just for records, but for all > constructors. > > Currently a constructor follows a production like: > > ??? [ explicit-ctor-invocation ] statement* > > We can extend this to be: > > ??? statement* [ explicit-ctor-invocation statement* ] > > and treat `this` as DU in the statements in the first block. > > ...END DIGRESSION > > OK, so we can put a statement ahead of the super-call.? But this > explicit declaration is awfully verbose.? We can trim this by: > ?- Allow the compiler to infer the signature for the default > constructor, if none is provided; > ?- Provide a shorthand for "just do the default initialization". > > Now we get: > > ??? public B { > ??????? if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??????? default.this(a, b); > ??? } > > There's still some repetition here; it would be nice if the default > initialization were inferred as well.? Which leads to a question: if we > have a record constructor with no explicit constructor call, do we do > the default initialization at the beginning or the end?? In other words, > does this: > > ??? public B { > ??????? if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??? } > > mean > > ??? public B { > ??????? if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??????? default.this(a, b); > ??? } > > or this: > > public B { > default.this(a, b); > if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??? } > > The two are subtly different, and the difference becomes apparent if we > want to normalize arguments or make defensive copies, not just validate: > > public B { > ??????? if (b <= 0) > ??????????? b = 0; > ??? 
} > > If we put our implicit construction at the beginning, this would be a > dead assignment to the parameter, after the record was initialized, > which is almost certainly not what the user meant.? If we put it at the > end, this would pick up the update.? The former seems pretty > error-prone, so the latter seems attractive. > > However, this runs into another issue, which is: what if we have > additional fields?? (We might disallow this, but we might not.)? Now > what if we wanted to do: > > ??? record B(int a, int b) { > ??????? int cachedSum; > > ??????? B { > ??????????? cachedSum = a + b; > ??????? } > ??? } > > If we treat the explicit statements as occuring before the default > initialization, now `this` is DU at the point of assigning `cachedSum`, > and the compiler tells us that we can't do this.? Of course, there's a > workaround: > > B { > default.this(a, b); > cachedSum = a + b; > ??????? } > > which might be good enough. (Note that we'd like to be able to extend > this ability to constructors of classes other than records eventually, > so we should work out the construction protocol in generality even if > we're not going to do it all now.) > > Is `default.this(a, b)` still too verbose/general/error-prone?? Would > some more invariant marker ("do the default thing now") be better, like: > > ??? B { > ??????? new; > this.cachedSum = a + b; > ??? } > > > > So, summarizing: > ?- We're OK with Foo { ... } as shorthand for the default constructor? > ?- Where should the implicit construction go -- beginning or end? > ?- Should there be a better idiom other than default.this(args) for "do > the explicit construction now"? From brian.goetz at oracle.com Mon Mar 12 22:36:53 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 12 Mar 2018 18:36:53 -0400 Subject: Records: construction and validation In-Reply-To: <1782502169.431728.1520893905796.JavaMail.zimbra@u-pem.fr> References: <9ba9752c-d595-0021-6205-cff020443558@oracle.com> <1782502169.431728.1520893905796.JavaMail.zimbra@u-pem.fr> Message-ID: > > Thinking only in term of constructor is only half of the story of preconditions, > you also need precondition on setters (and getters for the defensive copy), that's the advantage of moving the requirements in its own 'section', you can distribute the code to the different generated part. Yes, and also this connects with generation of destructuring match too. And yes, I want to line the story for records up with the ultimate story for how these are done more generally. And yes, I have some thoughts on this too. But, I?m trying to keep focused on the problem immediately ahead of us, so that we can make forward progress. So, let?s come back to that ... > BTW normalizing arguments is nice way to lost the trust of your users, the behavior is obvious if you are the writer of the class and it's a real pain if you are the poor user that had to figure out why when he send a value, the code acts as if it is another value. So normalization should always occur at call site and never at callee site. And it's the same issue with default values. While I agree that there are risks to normalization, sometimes its a reasonable thing to do. I don?t want to tell users that they can?t (say) reduce a Rational to lowest terms in the constructor. So I?d like a mechanism that supports normalization where needed. So far, having thought about it for an hour, I really like the ?super in the front, field initializations in the back? interpretation. 
It's a nice generalization of what we have already, and it does the unsurprising thing in all the use cases I've come up with so far.

From forax at univ-mlv.fr  Mon Mar 12 23:10:43 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 13 Mar 2018 00:10:43 +0100 (CET)
Subject: Updated Constables and constant folding doc
In-Reply-To: <7fb7f95b-6f0e-00c0-cba5-6fbae508a649@oracle.com>
References: <7fb7f95b-6f0e-00c0-cba5-6fbae508a649@oracle.com>
Message-ID: <1485976632.434870.1520896243972.JavaMail.zimbra@u-pem.fr>

Hi Brian,
the way the constant folding is done in this proposal scares me a lot. Pre-linking is usually a bad idea when done in the compiler, because it means that the world the compiler sees and the world the VM sees have to be exactly the same, and there is nothing in the spec that guarantees that the compiler and the VM are the same.

I've already been beaten badly by smart compilers in the past: if you have a bug in one constant-folded method, you have as many bugs as the number of class files generated by the compiler.

We already have the bug that static final constants are resolved at compile time instead of at runtime; I see your proposal as a way to add more bugs.

We have ConstantDynamic, which is a way to do an arbitrary evaluation of a value at runtime and consider it as a constant. This has the advantage of doing the constant folding at runtime. Doing the constant folding at runtime is obviously slower, but it eliminates a whole class of bugs.

It works this way: when a method is marked @ConstantFoldable, the compiler creates a synthetic method (as with a lambda) that will be used to initialize the condy. For example,

    String s = `a long multi-
    line string`.trimIndent();

is compiled to

    ldc Condy String ConstantFoldableMetaFactory [constantFoldable$0 as a constant method handle]

with

    private static String constantFoldable$0() {
        return `a long multi-
    line string`.trimIndent();
    }

with the BSM of ConstantFoldableMetaFactory calling the constant method handle sent as a bootstrap argument, after having verified that it was created from the same lookup as the one sent as the first argument of the BSM.

This also avoids having the compiler execute arbitrary code.

Note: in the section on decapturing of lambdas, the 'prefix' is called 'p' inside the lambda.

Rémi

----- Mail original -----
> De: "Brian Goetz"
> À: "amber-spec-experts"
> Envoyé: Lundi 12 Mars 2018 22:53:21
> Objet: Updated Constables and constant folding doc

> I've updated the document on the ConstantRef / Constable API, and added
> more detail on constant folding and propagation optimizations:
>
http://cr.openjdk.java.net/~briangoetz/amber/constables.html From forax at univ-mlv.fr Mon Mar 12 23:27:01 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 13 Mar 2018 00:27:01 +0100 (CET) Subject: Updated Constables and constant folding doc In-Reply-To: <1485976632.434870.1520896243972.JavaMail.zimbra@u-pem.fr> References: <7fb7f95b-6f0e-00c0-cba5-6fbae508a649@oracle.com> <1485976632.434870.1520896243972.JavaMail.zimbra@u-pem.fr> Message-ID: <1721365225.435384.1520897221589.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Remi Forax" > ?: "Brian Goetz" > Cc: "amber-spec-experts" > Envoy?: Mardi 13 Mars 2018 00:10:43 > Objet: Re: Updated Constables and constant folding doc > Hi Brian, > the way the constant folding is done in this proposal scare me a lot, > pre-linking is usually a bad idea when done in the compiler because it means > that the world the compiler sees and the world VM sees has to be exactly the > same, > and there is nothing in the spec that guarantee that the compiler and the VM are > the same. > > I've already be beaten badly by smart compilers in the past, if you have a bug > in one constant folded method, you have as many bugs as the number of class > files generated by the compiler. > > We already have that bug that static final constants that are resolved at > compile time instead of at runtime. i see your proposal as a way to add more > bugs. > > We have ConstantDynamic which is a way to do an arbitrary evaluation of a value > at runtime and consider it as a constant. This has the advantage of doing the > constant folding at runtime. Doing the constant folding at runtime is obviously > slower but it eliminates a whole class of bugs. > It works that way, when a method is marked at @ConstantFoldable, the compiler > will create a synthetic method (like with a lambda) that will be used to > initialize the condy. > By example, > String s = `a long multi- > line string`.trimIndent(); > is compiled to > ldc Condy String ConstantFoldableMetaFactory [constantFoldable$0 as a constant > method handle] > with > private static String constantFoldable$0() { > return `a long multi- > line string`.trimIndent(); > } > > with the bsm of ConstantFoldableMetaFactory calling the constant method handle > sent as bootstrap argument after having verified it was created from the same > lookup as the one sent as first argument of the BSM. > > This also avoid to have the compiler executing an arbitrary code. > > Note: in the section decapturing of lambda, the 'prefix' is called 'p' inside of > the lambda. > and i forget to say to jlink can be used to really do the constant folding by simplifying the ldc condy if the BSM is the one from ConstantFoldableMetaFactory, because at jlink time, you know the exact version of Java that will be used at runtime. > R?mi > > ----- Mail original ----- >> De: "Brian Goetz" >> ?: "amber-spec-experts" >> Envoy?: Lundi 12 Mars 2018 22:53:21 >> Objet: Updated Constables and constant folding doc > >> I've updated the document on the ConstantRef / Constable API, and added >> more detail on constant folding and propagation optimizations: >> > > ??? 
http://cr.openjdk.java.net/~briangoetz/amber/constables.html From brian.goetz at oracle.com Mon Mar 12 23:44:15 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 12 Mar 2018 19:44:15 -0400 Subject: Updated Constables and constant folding doc In-Reply-To: <1485976632.434870.1520896243972.JavaMail.zimbra@u-pem.fr> References: <7fb7f95b-6f0e-00c0-cba5-6fbae508a649@oracle.com> <1485976632.434870.1520896243972.JavaMail.zimbra@u-pem.fr> Message-ID: <44DEEB68-4DD7-4C8E-BF87-CC78BEA5563B@oracle.com> > the way the constant folding is done in this proposal scare me a lot, It should scare you, it means you are reading ... > pre-linking is usually a bad idea when done in the compiler because it means that the world the compiler sees and the world VM sees has to be exactly the same, Yes, it means that the bar for @Foldable/Constable is very, very high. > and there is nothing in the spec that guarantee that the compiler and the VM are the same. It means that the implementation of the @F/C classes must guarantee this. Because of all the constraints, its use will be limited to java.base. From kevinb at google.com Tue Mar 13 19:32:51 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 13 Mar 2018 12:32:51 -0700 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: On Fri, Mar 9, 2018 at 3:21 PM, Louis Wasserman wrote: Simplifying: let's call normal cases in a switch simple if they're a single > statement or a no-op fallthrough, and let's call a default simple if it's a > single statement or it's not there at all. > > Among switches apparently convertible to expression switches, > > - 81% had all simple normal cases and a simple default. > - 5% had all simple normal cases and a nonsimple default. > - 12% had a nonsimple normal case and a simple default. > - 2% had a nonsimple normal case and a nonsimple default. > > I was surprised it was as high as 19%, so I grabbed a random sample of these 45 occurrences from Google's codebase and reviewed them. My goal was to find evidence that multi-statement cases in expression switches are important and common. Spoiler: I found said evidence underwhelming. There were 3 that I would call false matches (e.g. two that simply used a void `return` instead of `break` after every case without reason). There were fully 20 out of the remaining 42 that I quickly concluded should be refactored regardless of anything else, and where that refactoring happens to leave them with only simple cases and simple/no default. These refactorings were varied (hoist out code common to all non-exception cases; simplify unreachable code; change to `if` if only 1-2 cases; extract a method (needing only 1-2 parameters) for a case that is much bigger than the others; switch from loop to Streams; change `if/else` to ?:; move a precondition check to a more appropriate location; and a few other varied cleanups). Next there were 7 examples where the non-simple cases included side-effecting code, like setting fields or calling void methods. In Google Style I expect that we will probably forbid (or at least strongly dissuade) side effects in expression switch. I should probably bring this up separately, but I am pretty convinced by now that users should see expression switch and procedural switch as two completely different things, and by convention should always keep the former purely functional. 
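To illustrate what I mean by that, here are two renderings of the same choice (the names are made up, and the arrow form is just the strawman syntax from earlier in the thread, not a claim about final syntax):

    // statement switch: the result escapes via assignment, and side effects are easy to sprinkle in
    String label;
    switch (color) {
        case RED:  log.warn("saw red"); label = "warm"; break;
        case BLUE: label = "cool"; break;
        default:   label = "other";
    }

    // expression switch: the whole construct is just a value, and a case has nothing to do but produce it
    String label = switch (color) {
        case RED  -> "warm";
        case BLUE -> "cool";
        default   -> "other";
    };

The second form is the one I'd like style guidance to be able to call side-effect-free by convention.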
Next there were 7 examples where a case was "non-simple" only because it was using the "log, then return a null object (or null), instead of throwing an exception" anti-pattern (I was surprised this was that popular), and another 2 that used the "log-and-also-throw" anti-pattern.

2 examples had a use-once local variable that saved a *little* bit of nesting. I wouldn't normally refactor these, but if expression switch had no mechanism for multi-statement cases, I wouldn't think twice about it.

1 example had cases that looked nearly identical, 3 statements each, that could all be hoisted out of the switch, except that the types that differed across the three didn't implement a common interface (as they clearly should have). Slightly compelling.

1 example had all simple cases except that one also wanted to check an assertion. Okay, slightly compelling.

Finally, the cases that were the most compelling to me: 3 examples had one or more large cases, where factoring them out into helper methods would imho be ugly because >=3 parameters would be required. If expression switch didn't permit multi-statement cases, I would just keep them as procedural switches. It's only 3 out of 42.

Summary:

imho, early signs suggest that the grossness of `break x` is not *nearly* justified by the actual observed positive value of supporting multi-statement cases in expression switch. Are we open to killing that, or would we be if I produced more and clearer evidence?

> On Fri, Mar 9, 2018 at 2:56 PM Brian Goetz wrote:
>
>> Did you happen to calculate what percentage was _not_ the "default"
>> case? I would expect that to be a considerable fraction.
>>
>> On 3/9/2018 5:49 PM, Kevin Bourrillion wrote:
>>
>> On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax wrote:
>>
>>> When i asked what we should do instead, the answer is either:
>>> 1/ we should not allow block of codes in the expression switch but
>>> only expression
>>> 2/ that we should use the lambda syntax with return, even if the
>>> semantics is different from the lambda semantics.
>>>
>>> I do not like (1) because i think the expression switch will become
>>> useless
>>
>> In our (large) codebase, +Louis determined that, among switch statements
>> that appear translatable to expression switch, 13.8% of them seem to
>> require at least one multi-statement case.

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From kevinb at google.com  Tue Mar 13 20:02:15 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Tue, 13 Mar 2018 13:02:15 -0700
Subject: expression switch vs. procedural switch
Message-ID:

I haven't been part of all the discussions that have led to this point, and also am not a Language Person, so I apologize for the things I'm missing in the following.

The more I have thought about it, the more I believe that 95% of the entire value of expression switch is that it *isn't procedural switch*, and is easier to reason about than procedural switch because of all the things it *can't* do:

- can't miss cases
- can't return
- can't break/continue a containing construct
- can't fall through
- (for constants or other disjoint patterns) can't depend on the order of cases.

As far as I can tell, its limitations are exactly what make it useful. (It isn't really a large savings of code bulk, and it's not that we really want to see switches appearing in expression contexts in general besides the key ones `var = ...` and `return ...`.)
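For a concrete picture of the shape I mean, here is the strawman arrow form on a made-up enum, sitting in the `return` position:

    enum Direction { NORTH, SOUTH, EAST, WEST }

    static Direction opposite(Direction d) {
        return switch (d) {
            case NORTH -> SOUTH;
            case SOUTH -> NORTH;
            case EAST  -> WEST;
            case WEST  -> EAST;
        };
    }

Every input is covered, nothing can fall through, and a case can do nothing except produce the value; assuming the exhaustiveness checking discussed elsewhere in the thread, leaving a constant out would be a compile-time complaint rather than a latent bug.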
I also believe that all these limitations work to support the notion that an expression switch is *functional* in nature. It is a function defined "in parts" (and immediately executed). As such, I believe we should discourage using expression switch in side-effecting ways. More to the point, I suggest that side-effecting use cases should not be seen as especially motivating for our design decisions (see e.g. my message 30 minutes ago in "break seen as a C archaism"). If we can manage to get in sync on the above (?), then I'd like to further argue that we should *leave procedural switch alone* (with the *possible* exception of enabling `case A, B` because that seems so very harmless, but *meh*). I believe anything we do try to to smooth out differences between procedural and expression switch and make them "really just the same thing" directly works *against* the value proposition above. It can't be a simpler thing if it is the same thing. And, of course, procedural switch has been around and unchanged (except for new types) since the very beginning. If we ever manage to agree on this much (?), then we can ask ourselves if we should still use `switch` and `case` in expression switch or go with a separate syntax. Again, sorry if I'm retreading ground that is already considered well trod. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Mar 13 20:18:45 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 13 Mar 2018 16:18:45 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: Thanks for the detailed analysis.? I'm glad to see that a larger percentage could be converted to expression switch given sufficient effort; I'm also not surprised to see that a lot had accidental reasons that kept them off the happy path.? And, your analysis comports with expectations in another way: that some required more significant intervention than others to lever them into compliance.? For example, the ones with side-effecting activities or logging, while good candidates for "strongly dissuade", happen in the real world, and not all codebases have the level of discipline or willingness to refactor that yours does.? So I'm not sure I'm ready to toss either of the size-7 sub-buckets aside so quick; not everyone is as sanguine as "well, go refactor then" as you are. Which is to say, adjusting the data brings the simple bucket from 86% (which seemed low) to 93-95%, which is "most of the time" but not "all of the time".? So most of the time, the code will not need whatever escape mechanism (break or block), but it will often enough that having some escape hatch is needed. You didn't mention fallthrough, but because that's everyone's favorite punching bag here, I'll add: most of the time, fallthrough is not needed in expression switches, but having reviewed a number of low-level switches in the JDK, it is sometimes desirable even for switches that can be converted to expressions.? One of the motivations for refining break in this way is so that when fallthrough is needed, the existing idiom just works as everyone understands it to. > imho, early signs suggest that the grossness of `break x` is not > /nearly/ justified by the actual observed positive value of supporting > multi-statement cases in expression switch. 
Are we open to killing > that, or would we be if I produced more and clearer evidence? That's one valid interpretation of the data, but not the only. Whether making break behave more like return (takes a value in non-void context, doesn't take one in void context) is gross or natural is subjective.? Here's my subjective interpretation: "great, most of the time, you don't have to use the escape hatch, but when you do, it has control flow just like the break we know, extended to take a value in non-void contexts, so will be fairly familiar." But setting aside subjective reactions, are there better alternatives?? Let's review what has been considered already, and why they've been passed over: ?- Do nothing; only allow single expressions.? Non-starter. ?- Traditional "block expressions"; { S; S; e }.? Terrible fit for Java, so no. ?- Some other form of block expression.? Seems a very big hammer for a small problem, which will surely interact with other features, and will likely call for follow-ons of its own. ?- Some sort of bespoke "block expression for switch". On the latter, one obvious choice is something lambda-like: ??? case 1 -> 1; ??? case 2 -> { println("two"); return 2; } You might argue that this is familiar because it's using `return` just like lambda, but ... yuck.? Lambdas are their own invocation scope, so `return` can be twisted into making sense, but the block of a switch is not, so `return` is definitely the wrong word here.? (Arguably it was the wrong word for lambdas too; had someone suggested `break` at the right time back then I would probably have been pretty compelled by this suggestion, but we picked `return` early (when we were still caught up in "lambdas are sugar for inner classes") and didn't look back.? Oh well.) But it really seems like a bridge too far to use `return` here. The obvious alternative, then, is ... break: ??? case 1 -> 1; ??? case 2 -> { println("two"); break 2; } But that is pretty similar to what we have now, just with braces.? If the concern is that we're stretching `break` too far, then this is just as bad. Worse, it has two significant additional downsides: 1.? You can't fall through at all.? (Yes, I know some people think this is an upside.)? But real code does use fallthrough, and this leaves them without any alternative; it also widens the asymmetry of expression switch vs statement switch.? (Combine this with other suggestions that widen the asymmetry between pattern and non-pattern switch, and you have four switch constructs.? Oops.) 2.? Either you can only use these block expressions in switch, in which case people hate us for one reason, or you can use them everywhere, and they hate us for another.? (I have a hard time imagining that this doesn't run into conflicts with other contexts in which one could use break (how could it not), plus, I don't think this is the block expression idiom I want in the language anyway.) So it seems like a half-measure that is worse on nearly every metric. There might be other alternatives, but I don't see a better one, other than deprecating switch and designing a whole new mechanism.? Which, while I understand the attraction of, I don't think that's doing the users a favor either. And, to defend what we've proposed: it's exactly the switch we all know, warts and all.? Very little new; very little in the way of asymmetry between void/value and pattern/constant.? 
The cost is that we have to accept the existing warts, primarily the weird block expression (blocks of statements with break not surrounded by braces), the weird scoping, and fallthrough. This choice reminds me of the old Yiddish proverb of the Tree of Sorrows.? (https://www.inspirationalstories.com/0/69.html). If you've got something better ... On 3/13/2018 3:32 PM, Kevin Bourrillion wrote: > On Fri, Mar 9, 2018 at 3:21 PM, Louis Wasserman > wrote: > > Simplifying: let's call normal cases in a switch simple if they're > a single statement or a no-op fallthrough, and let's call a > default simple if it's a single statement or it's not there at all. > > Among switches apparently convertible to expression switches, > > * 81% had all simple normal cases and a simple default. > * 5% had all simple normal cases and a nonsimple default. > * 12% had a nonsimple normal case and a simple default. > * 2% had a nonsimple normal case and a nonsimple default. > > I was surprised it was as high as 19%, so I grabbed a random sample of > these 45 occurrences from Google's codebase and reviewed them. My goal > was to find evidence that multi-statement cases in expression switches > are important and common. Spoiler: I found said evidence underwhelming. > > There were 3 that I would call false matches (e.g. two that simply > used a void `return` instead of `break` after every case without reason). > > There were fully 20 out of the remaining 42 that I quickly concluded > should be refactored regardless of anything else, and where that > refactoring happens to leave them with only simple cases and simple/no > default. These refactorings were varied (hoist out code common to all > non-exception cases; simplify unreachable code; change to `if` if only > 1-2 cases; extract a method (needing only 1-2 parameters) for a case > that is much bigger than the others; switch from loop to Streams; > change `if/else` to ?:; move a precondition check to a more > appropriate location; and a few other varied cleanups). > > Next there were 7 examples where the non-simple cases included > side-effecting code, like setting fields or calling void methods. In > Google Style I expect that we will probably forbid (or at least > strongly dissuade) side effects in expression switch. I should > probably bring this up separately, but I am pretty convinced by now > that users should see expression switch and procedural switch as two > completely different things, and by convention should always keep the > former purely functional. > > Next there were 7 examples where a case was "non-simple" only because > it was using the "log, then return a null object (or null), instead of > throwing an exception" anti-pattern. I was surprised this was that > popular. and another 2 that used the "log-and-also-throw" anti-pattern. > > 2 examples had a use-once local variable that saved a /little/?bit of > nesting. I wouldn't normally refactor these, but if expression switch > had no mechanism for multi-statement cases, I wouldn't think twice > about it. > > 1 example had cases that looked nearly identical, 3 statements each, > that could all be hoisted out of the switch, except that the types > that differed across the three didn't implement a common interface (as > they clearly should have). Slightly compelling. > > 1 example had all simple cases except that one also wanted to check an > assertion. Okay, slightly compelling. 
> > Finally, the cases that were the most compelling to me: 3 examples had > one or more large cases, where factoring them out into helper methods > would imho be ugly because >=3 parameters would be required. If > expression switch didn't permit multi-statement cases, I would just > keep them as procedural switches. It's only 3 out of 42. > > Summary: > > imho, early signs suggest that the grossness of `break x` is not > /nearly/ justified by the actual observed positive value of supporting > multi-statement cases in expression switch. Are we open to killing > that, or would we be if I produced more and clearer evidence? > > > > > > > On Fri, Mar 9, 2018 at 2:56 PM Brian Goetz > wrote: > > Did you happen to calculate what percentage was _not_ the > "default" case?? I would expect that to be a considerable > fraction. > > On 3/9/2018 5:49 PM, Kevin Bourrillion wrote: >> On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax > > wrote: >> >> When i asked what we should do instead, the answer is either: >> ? 1/ we should not allow block of codes in the expression >> switch but only expression >> ? 2/ that we should use the lambda syntax with return, >> even if the semantics is different from the lambda semantics. >> >> I do not like (1) because i think the expression switch >> will become useless >> >> >> In our (large) codebase, +Louis determined that, among switch >> statements that appear translatable to expression switch, >> 13.8% of them seem to require at least one multi-statement case. >> > > > > > -- > Kevin Bourrillion?|?Java Librarian |?Google, Inc.?|kevinb at google.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Tue Mar 13 21:21:37 2018 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Mar 2018 14:21:37 -0700 Subject: expression switch vs. procedural switch In-Reply-To: References: Message-ID: <09BDB960-25E4-4D4B-BCAA-6E0729DD634F@oracle.com> On Mar 13, 2018, at 1:02 PM, Kevin Bourrillion wrote: > > The more I have thought about it, the more I believe that 95% of the entire value of expression switch is that it isn't procedural switch, and is easier to reason about than procedural switch because of all things it can't do: > can't miss cases > can't return > can't break/continue a containing construct > can't fall through > (for constants or other disjoint patterns) can't depend on the order of cases. > As far as I can tell, its limitations are exactly what make it useful. (It isn't really a large savings of code bulk, and it's not that we really want to see switches appearing in expression contexts in general besides the key ones `var = ??` and `return ??`.) > > I also believe that all these limitations work to support the notion that an expression switch is functional in nature. It is a function defined "in parts" (and immediately executed). As such, I believe we should discourage using expression switch in side-effecting ways. More to the point, I suggest that side-effecting use cases should not be seen as especially motivating for our design decisions (see e.g. my message 30 minutes ago in "break seen as a C archaism"). These are all real issues but they don't all cut so uniformly in the direction you are seeking to uphold. I would refactor your list as: A. must complete with a value or throw; cannot complete with control flow (covers return/break/continue) B. is exhaustive (can't miss cases, can't fall through) C. when patterns are disjoint, case order is insignificant D. 
acts like a lambda body (is functional, no external side effects) For expression switches, A is truly a unique requirement. In Java expressions cannot complete with a branch; they must complete with a value or throw. I don't think any of us want to add a new kind of expression which suddenly can branch to a visible label or return, without the help of an enclosing branch statement. As for statements, they can branch, within limits (lambda bodies and method bodies). B is a requirement that is necessary for expression switches also, but it is also a desirable property of *some* statement switches. So there's some design work to do motivated by e-switches that will benefit s-switches. I'm thinking in particular of some simple way to certify that a switch (either kind) is intended to be exhaustive, asking the static and runtime systems to give suitable diagnostics if that ever fails. (Or is "fallthrough" the phenomenon where several case labels converge to one statement? I am doubtful that is what you mean because it seems expression switches are very likely to need to reply to several target values with one expression, just as with statement switches.) C is a tautology, so I'm not sure what it tells us. Are you saying that order-invariance is an important property for expressions but not statements? When we get to overlapping case labels (patterns) they will be equally welcome in s-switches and e-switches. D is an extension of A, upgrading the branch-free property to the absence of all side effects, making s-switches like lambdas. I do *not* think this is a realistic goal, not even for a highly disciplined shop like Google. Why? Because in Java expressions have lots of side effects. Consider: Object x = i < len ? a[i++] : null; Object x = it.hasNext() ? it.next() : null; Those expressions are two-branch conditionals (disguised if-statements) with side effects. They are not "functional" in any robust sense (the iterator is a shallow container for non-functional state just like a and i++). As soon as you have more than two branches to your conditional, you want a switch expression, and it may very well operate on ambient state, just like many other Java expressions: Object x = switch (len - i) { case 0 -> null; case 1 -> a[i++]; default: case 2 -> (a[i++] << 8) + a[i++]; }; The way I see it, D is undesirable, A is necessary to the physics of expressions but doesn't tell us anything about the nature of e-switch, and B and C apply to both kinds of switches. So there's nothing here that teaches us to treat e-switch as something with its own special mission defined by its limitations. Instead, I very much believe in Brian's design heuristic of running a refactoring exercise over switch use cases, to make sure that there is (when possible) an easy transition between s-switch and e-switch. The effect of this heuristic is to keep both switches aligned in their capabilities (where physics allow) lowering the learning burden, and making it easy for programmers to convert between the forms as needs and tastes require. ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Tue Mar 13 21:57:49 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 13 Mar 2018 22:57:49 +0100 (CET) Subject: expression switch vs. 
procedural switch In-Reply-To: <09BDB960-25E4-4D4B-BCAA-6E0729DD634F@oracle.com> References: <09BDB960-25E4-4D4B-BCAA-6E0729DD634F@oracle.com> Message-ID: <1580246980.901161.1520978269164.JavaMail.zimbra@u-pem.fr> Hi John, it's perhaps me, but i think that Kevin saying that he wants to discourage side effects, not disable them, avoiding side effects like it.next() is not possible in Java. We already have this kind of discussion when we have discussed about lambdas, lambdas allow side effects but it's discouraged to use lambdas for that. And We already have agree that case expressions do not need to capture variables, so expression in a case are like expressions in a ?:, you can do side effect in them but it's discouraged. R?mi > De: "John Rose" > ?: "Kevin Bourrillion" > Cc: "amber-spec-experts" > Envoy?: Mardi 13 Mars 2018 22:21:37 > Objet: Re: expression switch vs. procedural switch > On Mar 13, 2018, at 1:02 PM, Kevin Bourrillion < [ mailto:kevinb at google.com | > kevinb at google.com ] > wrote: >> The more I have thought about it, the more I believe that 95% of the entire >> value of expression switch is that it isn't procedural switch , and is easier >> to reason about than procedural switch because of all things it can't do: >> * can't miss cases >> * can't return >> * can't break/continue a containing construct >> * can't fall through >> * (for constants or other disjoint patterns) can't depend on the order of cases. >> As far as I can tell, its limitations are exactly what make it useful. (It isn't >> really a large savings of code bulk, and it's not that we really want to see >> switches appearing in expression contexts in general besides the key ones `var >> = ??` and `return ??`.) >> I also believe that all these limitations work to support the notion that an >> expression switch is functional in nature. It is a function defined "in parts" >> (and immediately executed). As such, I believe we should discourage using >> expression switch in side-effecting ways. More to the point, I suggest that >> side-effecting use cases should not be seen as especially motivating for our >> design decisions (see e.g. my message 30 minutes ago in "break seen as a C >> archaism"). > These are all real issues but they don't all cut so uniformly in > the direction you are seeking to uphold. I would refactor your > list as: > A. must complete with a value or throw; cannot complete with control flow > (covers return/break/continue) > B. is exhaustive (can't miss cases, can't fall through) > C. when patterns are disjoint, case order is insignificant > D. acts like a lambda body (is functional, no external side effects) > For expression switches, A is truly a unique requirement. In Java expressions > cannot complete with a branch; they must complete with a value or throw. > I don't think any of us want to add a new kind of expression which suddenly > can branch to a visible label or return, without the help of an enclosing branch > statement. As for statements, they can branch, within limits (lambda bodies > and method bodies). > B is a requirement that is necessary for expression switches also, but it is > also a desirable property of *some* statement switches. So there's some > design work to do motivated by e-switches that will benefit s-switches. > I'm thinking in particular of some simple way to certify that a switch > (either kind) is intended to be exhaustive, asking the static and runtime > systems to give suitable diagnostics if that ever fails. 
> (Or is "fallthrough" the phenomenon where several case labels converge > to one statement? I am doubtful that is what you mean because it seems > expression switches are very likely to need to reply to several target > values with one expression, just as with statement switches.) > C is a tautology, so I'm not sure what it tells us. Are you saying that > order-invariance is an important property for expressions but not > statements? When we get to overlapping case labels (patterns) > they will be equally welcome in s-switches and e-switches. > D is an extension of A, upgrading the branch-free property to the > absence of all side effects, making s-switches like lambdas. I do > *not* think this is a realistic goal, not even for a highly disciplined > shop like Google. Why? Because in Java expressions have lots > of side effects. Consider: > Object x = i < len ? a[i++] : null; > Object x = it.hasNext() ? it.next() : null; > Those expressions are two-branch conditionals (disguised > if-statements) with side effects. They are not "functional" in any > robust sense (the iterator is a shallow container for non-functional > state just like a and i++). > As soon as you have more than two branches to your conditional, > you want a switch expression, and it may very well operate on > ambient state, just like many other Java expressions: > Object x = switch (len - i) { > case 0 -> null; > case 1 -> a[i++]; > default: > case 2 -> (a[i++] << 8) + a[i++]; > }; > The way I see it, D is undesirable, A is necessary to the physics > of expressions but doesn't tell us anything about the nature of > e-switch, and B and C apply to both kinds of switches. So > there's nothing here that teaches us to treat e-switch as > something with its own special mission defined by its limitations. > Instead, I very much believe in Brian's design heuristic of > running a refactoring exercise over switch use cases, to make > sure that there is (when possible) an easy transition between > s-switch and e-switch. The effect of this heuristic is to keep > both switches aligned in their capabilities (where physics allow) > lowering the learning burden, and making it easy for programmers > to convert between the forms as needs and tastes require. > ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Tue Mar 13 22:46:05 2018 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Mar 2018 15:46:05 -0700 Subject: expression switch vs. procedural switch In-Reply-To: <1580246980.901161.1520978269164.JavaMail.zimbra@u-pem.fr> References: <09BDB960-25E4-4D4B-BCAA-6E0729DD634F@oracle.com> <1580246980.901161.1520978269164.JavaMail.zimbra@u-pem.fr> Message-ID: On Mar 13, 2018, at 2:57 PM, Remi Forax wrote: > > it's perhaps me, but i think that Kevin saying that he wants to discourage side effects, not disable them, avoiding side effects like it.next() is not possible in Java. > > We already have this kind of discussion when we have discussed about lambdas, lambdas allow side effects but it's discouraged to use lambdas for that. > And We already have agree that case expressions do not need to capture variables, so expression in a case are like expressions in a ?:, you can do side effect in them but it's discouraged. I get that. I'm just pointing out that, though "functional" is a great code style heuristic, expressions which occur inside of imperative loops will have to get themselves dirty as they work with loop control variables. 
Since imperative loops aren't going away in the foreseeable future, we have to envision side effects inside expressions. This BTW is another similarity between s-switches and e-switches: They both have a legitimate need to put side effects into local variables. With s-switches, it is the *only* way to export a value. With e-switches, *one* value can be exported the nice way via ->. But with *both* switches it will be common (as it is common now) to export *multiple* side effects (or one value and one side effect). My proof point of this is the ubiquity of the pattern a[i++], which delivers both a value and a side effect. More detail: Sometimes I write switches which export two or more values: String s; int n; switch (tag) { case 0: s = ""; n = -1; break; case 1: s = "one"; n = 0; break; default: s = "many"; n = 1; break; } As an extension to my point about a[i++], I would expect the freedom to refactor that as: String s; int n = switch (tag) { case 0: s = ""; break -1; case 1: s = "one"; break 0; default: s = "many"; break 1; } There is a flexible set of variations of this multiple-value delivery. The effect into s as a blank variable could just as easily have been s += "one", where s has a previous value; that's closer to the i++ in a[i++]. You will say "yuck", so do I. The e-switch refactoring above arguably reduces the yuck level from the s-switch. And there are probably pretty variations waiting to be written, if we allow them. ? John P.S. For thoughts on value-types as cursors, see: http://cr.openjdk.java.net/~jrose/values/iterator-vs-cursor.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Tue Mar 13 23:47:45 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 13 Mar 2018 23:47:45 +0000 Subject: expression switch vs. procedural switch In-Reply-To: <09BDB960-25E4-4D4B-BCAA-6E0729DD634F@oracle.com> References: <09BDB960-25E4-4D4B-BCAA-6E0729DD634F@oracle.com> Message-ID: On Tue, Mar 13, 2018 at 2:21 PM John Rose wrote: (Or is "fallthrough" the phenomenon where several case labels converge > to one statement? I am doubtful that is what you mean because it seems > expression switches are very likely to need to reply to several target > values with one expression, just as with statement switches.) > Nope, I just consider that to be multiple labels on a statement group, as implied by JLS 14.11. C is a tautology, so I'm not sure what it tells us. Are you saying that > order-invariance is an important property for expressions but not > statements? > Yeah, it follows from the absence of fall-through, and I retract listing it. :-) I was just in the mode of noting various ways we like that expression switches are simple. > D is an extension of A, upgrading the branch-free property to the > absence of all side effects, making s-switches like lambdas. I do > *not* think this is a realistic goal, not even for a highly disciplined > shop like Google. > I have no illusions of preventing side effects. I simply don't place much value on those use cases when evaluating usability of the feature. This was really more relevant to my other thread. The way I see it, D is undesirable, A is necessary to the physics > of expressions but doesn't tell us anything about the nature of > e-switch, and B and C apply to both kinds of switches. So > there's nothing here that teaches us to treat e-switch as > something with its own special mission defined by its limitations. 
> I think "B and C apply to both" is an oversimplification for both B and C. Only expression switches are exhaustive (whether via default or not) by their very nature. And case order matters when fall-through can exist. However, even if this argument is entirely valid, it still seems to be more relevant to a scenario where we get to design "both kinds" of switch at once in a brand new language. The fact that procedural switch has already been around 20+ years is the major reason why I'm advocating that we leave it alone. > Instead, I very much believe in Brian's design heuristic of > running a refactoring exercise over switch use cases, to make > sure that there is (when possible) an easy transition between > s-switch and e-switch. > I would add "for the subset of switch statements that we consider to be expression-shaped", but maybe this is where we have some disagreement? -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Wed Mar 14 00:23:44 2018 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Mar 2018 17:23:44 -0700 Subject: expression switch vs. procedural switch In-Reply-To: References: <09BDB960-25E4-4D4B-BCAA-6E0729DD634F@oracle.com> Message-ID: <584392BF-DC6D-4321-B917-42C281918165@oracle.com> On Mar 13, 2018, at 4:47 PM, Kevin Bourrillion wrote: > > I think "B and C apply to both" is an oversimplification for both B and C. > Only expression switches are exhaustive (whether via default or not) by their very nature. That's strictly true only in today's language, where it is also true that there's no such thing as an expression switch. In the future, exhaustiveness is an interesting possibility for statement switches also. For legacy reasons (unless we redesign from scratch) it will require a special marking on s-switches, where it will be a requirement of e-switches. That kind of nod to legacy is usually not big enough to warrant shifting our scope to a from-scratch e-switch. Although it will be a pebble in the shoe. Anyway, the different you are noting in e-switches is not that deep. You get the same effects (required exhaustivness) if you refactor the e-switch to an s-switch, with an assignment to a blank local as the mechanism for exporting the value you want. In that case, exhaustiveness checks are in the current language, as DA rules. (And note that if an e-switch delivers a second value, it will play with DA rules, just like an s-switch that delivers 2 values. So DA rules are common to both switches.) Also, even for e-switches an explicit exhaustiveness marker is desirable. For both kinds of switches, there are cases where the user wants to say, "I know I have covered all the possibilities" and thus turn off the effects of exhaustiveness checking, including DA rules and whatever DA-like rule we define for e-switches. Summary: No, I don't buy that e-switches are so special. > And case order matters when fall-through can exist. Case order will also matter with pattern switches. Sometimes cases won't even be linearly orderable, when multiple supertyping is in play. Thats a much stronger reason to watch for case order than fallthough (which is pretty terrible). So when I see a bunch of case statements in legacy code, I say to myself, "how nice that those guys are all disjoint, so I can edit my code the way I want, and the compiler can do an O(1) lookup". I don't see a different kind of switch than something where cases overlap. 
In that latter case, the compiler might still manage an O(1) lookup, and the compiler/IDE will guide me to prevent misordering. > However, even if this argument is entirely valid, it still seems to be more relevant to a scenario where we get to design "both kinds" of switch at once in a brand new language. The fact that procedural switch has already been around 20+ years is the major reason why I'm advocating that we leave it alone. > > > Instead, I very much believe in Brian's design heuristic of > running a refactoring exercise over switch use cases, to make > sure that there is (when possible) an easy transition between > s-switch and e-switch. > > I would add "for the subset of switch statements that we consider to be expression-shaped", but maybe this is where we have some disagreement? I regard the bridge between them as very wide. You might be more selective. ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Wed Mar 14 00:27:20 2018 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Mar 2018 17:27:20 -0700 Subject: expression switch vs. procedural switch In-Reply-To: <584392BF-DC6D-4321-B917-42C281918165@oracle.com> References: <09BDB960-25E4-4D4B-BCAA-6E0729DD634F@oracle.com> <584392BF-DC6D-4321-B917-42C281918165@oracle.com> Message-ID: <5D910B2C-4E48-4BA6-ACA5-B7CE861AF65E@oracle.com> On Mar 13, 2018, at 5:23 PM, John Rose wrote: > > Also, even for e-switches an explicit exhaustiveness marker is > desirable. For both kinds of switches, there are cases where > the user wants to say, "I know I have covered all the possibilities" > and thus turn off the effects of exhaustiveness checking, including > DA rules and whatever DA-like rule we define for e-switches. P.S. You probably surmised, correctly, that this involves injecting an implicit exception throw, at an implicit default. Whether the compiler asserts exhaustivness from type analysis, or the user asserts it just because, the runtime has to have a backup plan to throw a MatchError or SurpriseInSwitchError, to keep the JVM's verifier happy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Wed Mar 14 00:43:23 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 14 Mar 2018 00:43:23 +0000 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: Sorry for 5,000 inline replies. On Tue, Mar 13, 2018 at 1:18 PM Brian Goetz wrote: Thanks for the detailed analysis. I'm glad to see that a larger percentage > could be converted to expression switch given sufficient effort; > Just to be clear: the sample was only from the 19% already identified as convertible, so in *this* discussion that number is only going down. However, the important point is that we should separately investigate the 81% to see whether a few simple heuristics make them recognizable as e-switches too. > I'm also not surprised to see that a lot had accidental reasons that kept > them off the happy path. And, your analysis comports with expectations in > another way: that some required more significant intervention than others > to lever them into compliance. For example, the ones with side-effecting > activities or logging, while good candidates for "strongly dissuade", > happen in the real world, and not all codebases have the level of > discipline or willingness to refactor that yours does. 
So I'm not sure I'm > ready to toss either of the size-7 sub-buckets aside so quick; not everyone > is as sanguine as "well, go refactor then" as you are. > Okay, I want to clarify a few things about this style of research, which is how we have been evaluating API decisions for Guava and our other libraries for a long time, but which I skipped really explaining. First, we are using "existing code could refactor to" as a proxy for "what might they probably write anew today if they could". It's a conceit, but a useful one since the corpus of existing code has the benefit of being visible and analyzable. :-) So, arguments that we make don't necessarily rest on whether actual users *will actually refactor*. (Also, I mean, if they aren't willing to refactor *at all*, then they wouldn't be changing to expression switch at all so they'd be irrelevant to us anyway, but this is not the real point.) Second, we believe that the *most* useful (not *only*) way to judge the value of a feature lies in comparing the *best code* users can write without that feature to the *best code* they can write with it. Then we look at two factors: (a) how much better did the feature make this code, and (b) how commonly do we think this case comes up in real life. Roughly speaking we multiply (a) and (b) together (for which I have with questionable taste attempted to coin the phrase "utility times ubiquity") and that gives about how much we care about making the change. This type of analysis has driven most decisions about the shape of Guava's API. It's not the only kind of argument that can be made. One can also add "it might not make a large amount of 'great' code 'greater', but people who write mediocre code will be more likely to write decent code!" I just think that we should put less stock in such arguments than the other kind, because this sounds like a problem better addressed with static analysis tools, education, evangelism, and whatnot. > Which is to say, adjusting the data brings the simple bucket from 86% > (which seemed low) to 93-95%, which is "most of the time" but not "all of > the time". So most of the time, the code will not need whatever escape > mechanism (break or block), but it will often enough that having some > escape hatch is needed. > We think the ability to stick with procedural switch is that escape hatch already. You didn't mention fallthrough, but because that's everyone's favorite > punching bag here, I'll add: most of the time, fallthrough is not needed in > expression switches, but having reviewed a number of low-level switches in > the JDK, it is sometimes desirable even for switches that can be converted > to expressions. One of the motivations for refining break in this way is > so that when fallthrough is needed, the existing idiom just works as > everyone understands it to. > My assumption is that that code can just keep doing what it's already doing. My claim is that there is only value to changing to expression switch if we are getting the benefit of how much more simple and constrained it is. imho, early signs suggest that the grossness of `break x` is not *nearly* > justified by the actual observed positive value of supporting > multi-statement cases in expression switch. Are we open to killing that, or > would we be if I produced more and clearer evidence? > > > That's one valid interpretation of the data, but not the only. Whether > making break behave more like return (takes a value in non-void context, > doesn't take one in void context) is gross or natural is subjective. 
> Here's my subjective interpretation: "great, most of the time, you don't > have to use the escape hatch, but when you do, it has control flow just > like the break we know, extended to take a value in non-void contexts, so > will be fairly familiar." > I think that there are features that make sense on their own, and there are features that *totally make lots of sense* assuming that you have heard the expert group's passionate explanation of why they make sense. (It reminds me of a certain Pied Piper focus group near the end of Silicon Valley season 2, but moving on.) I am concerned that "breaking a value" is of the second kind. But setting aside subjective reactions, are there better alternatives? > Let's review what has been considered already, and why they've been passed > over: > > - Do nothing; only allow single expressions. Non-starter. > We're just saying the feature seems to be at least 90% as applicable without it. Roughly. Why is it a non-starter for the other 10% to stick with the switch they've always had? I'm sure there are good answers to that, I'm not doubting there are, but I think we should explore them instead of just declaring something a non-starter by fiat. case 1 -> 1; > case 2 -> { println("two"); break 2; } > > But that is pretty similar to what we have now, just with braces. If the > concern is that we're stretching `break` too far, then this is just as > bad. > > Worse, it has two significant additional downsides: > > 1. You can't fall through at all. (Yes, I know some people think this is > an upside.) > Yes! That! That's what we want. No fall-through. > But real code does use fallthrough, and this leaves them without any > alternative; it also widens the asymmetry of expression switch vs statement > switch. > Well, the other thread I started today is me literally *asking* for asymmetry between this and statement switch. If we stop using the `switch` keyword, so much the better. What are the motivating use cases for fall-through in expression switch? These must be exclusively examples featuring side-effects, right? Or is there a way for a case to access the result produced by the previous one and build on it? (Combine this with other suggestions that widen the asymmetry between > pattern and non-pattern switch, and you have four switch constructs. > Oops.) > (Not familiar with that stuff yet.) > There might be other alternatives, but I don't see a better one, other > than deprecating switch and designing a whole new mechanism. > I'm confused. `switch` has worked the same way for 20+ years; what could possibly motivate us to deprecate it? > And, to defend what we've proposed: it's exactly the switch we all know, > warts and all. Very little new; very little in the way of asymmetry > between void/value and pattern/constant. > (My response to this is already teed up in the other thread. Basically, it says that if we don't make expression switch suitably constrained then I have so far failed to grasp what its value is at all.) > The cost is that we have to accept the existing warts, primarily the > weird block expression (blocks of statements with break not surrounded by > braces), the weird scoping, and fallthrough. > > This choice reminds me of the old Yiddish proverb of the Tree of Sorrows. > (https://www.inspirationalstories.com/0/69.html). > Tangent, but I think that story actually advocates that we stick with exactly the switch statement we already have today. > If you've got something better ... 
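For reference, the braces-plus-value arm being weighed above corresponds closely to the block form that was eventually adopted (Java 14 and later): an arrow case may carry a block, the block must yield (or throw) its value, and there is no falling through into or out of it. A small compilable sketch, with `yield` where this thread writes `break`:

    static int classify(int x) {
        return switch (x) {
            case 1 -> 1;
            case 2 -> {
                System.out.println("two");  // statements are possible only inside a block arm
                yield 2;                    // the block must produce the arm's value
            }
            default -> 0;
        };
    }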
> > > > On 3/13/2018 3:32 PM, Kevin Bourrillion wrote: > > On Fri, Mar 9, 2018 at 3:21 PM, Louis Wasserman > wrote: > > Simplifying: let's call normal cases in a switch simple if they're a >> single statement or a no-op fallthrough, and let's call a default simple if >> it's a single statement or it's not there at all. >> >> Among switches apparently convertible to expression switches, >> >> - 81% had all simple normal cases and a simple default. >> - 5% had all simple normal cases and a nonsimple default. >> - 12% had a nonsimple normal case and a simple default. >> - 2% had a nonsimple normal case and a nonsimple default. >> >> I was surprised it was as high as 19%, so I grabbed a random sample of > these 45 occurrences from Google's codebase and reviewed them. My goal was > to find evidence that multi-statement cases in expression switches are > important and common. Spoiler: I found said evidence underwhelming. > > There were 3 that I would call false matches (e.g. two that simply used a > void `return` instead of `break` after every case without reason). > > There were fully 20 out of the remaining 42 that I quickly concluded > should be refactored regardless of anything else, and where that > refactoring happens to leave them with only simple cases and simple/no > default. These refactorings were varied (hoist out code common to all > non-exception cases; simplify unreachable code; change to `if` if only 1-2 > cases; extract a method (needing only 1-2 parameters) for a case that is > much bigger than the others; switch from loop to Streams; change `if/else` > to ?:; move a precondition check to a more appropriate location; and a few > other varied cleanups). > > Next there were 7 examples where the non-simple cases included > side-effecting code, like setting fields or calling void methods. In Google > Style I expect that we will probably forbid (or at least strongly dissuade) > side effects in expression switch. I should probably bring this up > separately, but I am pretty convinced by now that users should see > expression switch and procedural switch as two completely different things, > and by convention should always keep the former purely functional. > > Next there were 7 examples where a case was "non-simple" only because it > was using the "log, then return a null object (or null), instead of > throwing an exception" anti-pattern. I was surprised this was that popular. > and another 2 that used the "log-and-also-throw" anti-pattern. > > 2 examples had a use-once local variable that saved a *little* bit of > nesting. I wouldn't normally refactor these, but if expression switch had > no mechanism for multi-statement cases, I wouldn't think twice about it. > > 1 example had cases that looked nearly identical, 3 statements each, that > could all be hoisted out of the switch, except that the types that differed > across the three didn't implement a common interface (as they clearly > should have). Slightly compelling. > > 1 example had all simple cases except that one also wanted to check an > assertion. Okay, slightly compelling. > > Finally, the cases that were the most compelling to me: 3 examples had one > or more large cases, where factoring them out into helper methods would > imho be ugly because >=3 parameters would be required. If expression switch > didn't permit multi-statement cases, I would just keep them as procedural > switches. It's only 3 out of 42. 
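The "log, then return a null object (or null), instead of throwing an exception" shape counted above looks roughly like the following. This is a hypothetical sketch, not taken from the sampled code; it assumes a java.util.logging.Logger field named `logger`, and the labels are invented:

    // Non-simple only because of its logging-and-fallback default arm:
    String labelFor(int code) {
        switch (code) {
            case 0:  return "zero";
            case 1:  return "one";
            default:
                logger.warning("unexpected code: " + code);  // log ...
                return null;                                 // ... and hand back null rather than throw
        }
    }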
> > Summary: > > imho, early signs suggest that the grossness of `break x` is not *nearly* > justified by the actual observed positive value of supporting > multi-statement cases in expression switch. Are we open to killing that, or > would we be if I produced more and clearer evidence? > > > > > >> >> On Fri, Mar 9, 2018 at 2:56 PM Brian Goetz >> wrote: >> >>> Did you happen to calculate what percentage was _not_ the "default" >>> case? I would expect that to be a considerable fraction. >>> >>> On 3/9/2018 5:49 PM, Kevin Bourrillion wrote: >>> >>> On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax wrote: >>> >>> When i asked what we should do instead, the answer is either: >>>> 1/ we should not allow block of codes in the expression switch but >>>> only expression >>>> 2/ that we should use the lambda syntax with return, even if the >>>> semantics is different from the lambda semantics. >>>> >>>> I do not like (1) because i think the expression switch will become >>>> useless >>> >>> >>> In our (large) codebase, +Louis determined that, among switch statements >>> that appear translatable to expression switch, 13.8% of them seem to >>> require at least one multi-statement case. >>> >>> >>> > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Wed Mar 14 00:59:28 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 13 Mar 2018 17:59:28 -0700 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: On Tue, Mar 13, 2018 at 5:43 PM, Kevin Bourrillion wrote: But setting aside subjective reactions, are there better alternatives? >> Let's review what has been considered already, and why they've been passed >> over: >> >> - Do nothing; only allow single expressions. Non-starter. >> > > We're just saying the feature seems to be at least 90% as applicable > without it. Roughly. Why is it a non-starter for the other 10% to stick > with the switch they've always had? I'm sure there are good answers to > that, I'm not doubting there are, but I think we should explore them > instead of just declaring something a non-starter by fiat. > Also, if it is true that this is a "non-starter", I would assume it is also a non-starter to only allow single expressions in the conditional operator `?:`. If not, what is the fundamental difference? We normally don't get to embed statements inside expressions, except in the case of anonymous classes and lambdas, where we (a) they must be set off with curly braces, and (b) they are only embedded physically, and don't immediately execute. If we do this for switch, we should at least stick with (a), but (b) is a thing with no precedent. It seems reasonable that we should require some very solid motivation before breaking that precedent. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.goetz at oracle.com Wed Mar 14 01:38:21 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 13 Mar 2018 21:38:21 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> There are three arguments why the N case is significantly different from the 2 case. There are a number of idioms that require statements in addition to an expression. Debugging printfs, objects that take statements to initialize (construct/set/set/break), incrementing counters, cases that require side conditions (if today is tuesday do one thing, otherwise another), etc. Each individually is rare-ish, but not all that rare. The "static applicability" argument is that the larger the number of cases, the more likely one of them will fall into one these buckets, and then the whole thing has to fall back to statements. This makes expression switches less useful, and falling off this cliff is likely to irritate users every time it happens. The ?dynamic applicability? argument is that, if you want to change an existing switch (say, to add a debugging printf in one path), you have to refactor the whole thing. Which will be met, by users, with ?YGBFKM.? The ?cliff height? argument says that falling off the cliff on a two-way conditional and having to refactor to if-else is far less painful than falling off the cliff on an N-way switch. Its a more painful refactor. So for all these reasons, not being able to occasionally include some statements means many more switches that can?t use the feature (which is safer, clearer, and more compact), and also more often that the user will have to gratuitously refactor perfectly good code as they make small changes. > On Mar 13, 2018, at 8:59 PM, Kevin Bourrillion wrote: > > - Do nothing; only allow single expressions. Non-starter. > > We're just saying the feature seems to be at least 90% as applicable without it. Roughly. Why is it a non-starter for the other 10% to stick with the switch they've always had? I'm sure there are good answers to that, I'm not doubting there are, but I think we should explore them instead of just declaring something a non-starter by fiat. > > Also, if it is true that this is a "non-starter", I would assume it is also a non-starter to only allow single expressions in the conditional operator `?:`. If not, what is the fundamental difference? -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Mar 14 02:00:46 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 13 Mar 2018 22:00:46 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> Message-ID: I get what you?re looking for, I really do. Existing switch has a lot of warts, some of which could be avoided with a new expression switch construct. Avoiding warts seems like a good idea, and fall through is _even wartier_ with expression switches than statement switches. (Not sure how you quantify wartiness. Frogs per KLoC?) But, the problem is that if we make an expression switch construct, now we have two switch constructs. They?re similar, and you will frequently want to refactor between them, but they?re subtly different. And let?s not assume that the new construct will not have warts, so now that?s two different sets of warts the user has to keep in their head. 
Users who arrive at Java will ask, ?why are there two subtly different ways to do basically the same thing?? I don?t think that?s necessarily doing anyone a favor. If the warts were fatal, sure, that?s what we?d do, but they?re not. Mostly, it?s annoying that we?re stuck with them for another 20 years. While of course I may be too close to it, I think there?s a very nice balance in the current proposal. In addition to control flow working exactly the same across the two forms, which I think does reduce the complexity of the language, the shorthand for ?case X -> e? protects users from the most warty aspects almost all the time: - the at-least-at-first weirdness of ?break value?; - the need to say break at all; - the risk of accidental fall through. But if you need either explicit break or fall-through, they?re there, and they work just like the break you?ve always known, for better or worse. Just ?break? the glass. You?re saying ?it wouldn?t be that bad if switch expressions had no fall through and only could take single expressions on the RHS, no statements.? Well, if you restrict yourself to the -> syntactic form, you get exactly that! So the only difference is that the escape hatch exists, if you?re willing to use it. But if you?re not willing to use it, they you get exactly what you asked for. > On Mar 13, 2018, at 8:43 PM, Kevin Bourrillion wrote: > > Sorry for 5,000 inline replies. > > > On Tue, Mar 13, 2018 at 1:18 PM Brian Goetz > wrote: > > Thanks for the detailed analysis. I'm glad to see that a larger percentage could be converted to expression switch given sufficient effort; > > Just to be clear: the sample was only from the 19% already identified as convertible, so in this discussion that number is only going down. However, the important point is that we should separately investigate the 81% to see whether a few simple heuristics make them recognizable as e-switches too. > > > I'm also not surprised to see that a lot had accidental reasons that kept them off the happy path. And, your analysis comports with expectations in another way: that some required more significant intervention than others to lever them into compliance. For example, the ones with side-effecting activities or logging, while good candidates for "strongly dissuade", happen in the real world, and not all codebases have the level of discipline or willingness to refactor that yours does. So I'm not sure I'm ready to toss either of the size-7 sub-buckets aside so quick; not everyone is as sanguine as "well, go refactor then" as you are. > > Okay, I want to clarify a few things about this style of research, which is how we have been evaluating API decisions for Guava and our other libraries for a long time, but which I skipped really explaining. > > First, we are using "existing code could refactor to" as a proxy for "what might they probably write anew today if they could". It's a conceit, but a useful one since the corpus of existing code has the benefit of being visible and analyzable. :-) So, arguments that we make don't necessarily rest on whether actual users will actually refactor. (Also, I mean, if they aren't willing to refactor at all, then they wouldn't be changing to expression switch at all so they'd be irrelevant to us anyway, but this is not the real point.) > > Second, we believe that the most useful (not only) way to judge the value of a feature lies in comparing the best code users can write without that feature to the best code they can write with it. 
Then we look at two factors: (a) how much better did the feature make this code, and (b) how commonly do we think this case comes up in real life. Roughly speaking we multiply (a) and (b) together (for which I have with questionable taste attempted to coin the phrase "utility times ubiquity") and that gives about how much we care about making the change. This type of analysis has driven most decisions about the shape of Guava's API. > > It's not the only kind of argument that can be made. One can also add "it might not make a large amount of 'great' code 'greater', but people who write mediocre code will be more likely to write decent code!" I just think that we should put less stock in such arguments than the other kind, because this sounds like a problem better addressed with static analysis tools, education, evangelism, and whatnot. > > > Which is to say, adjusting the data brings the simple bucket from 86% (which seemed low) to 93-95%, which is "most of the time" but not "all of the time". So most of the time, the code will not need whatever escape mechanism (break or block), but it will often enough that having some escape hatch is needed. > > We think the ability to stick with procedural switch is that escape hatch already. > > > You didn't mention fallthrough, but because that's everyone's favorite punching bag here, I'll add: most of the time, fallthrough is not needed in expression switches, but having reviewed a number of low-level switches in the JDK, it is sometimes desirable even for switches that can be converted to expressions. One of the motivations for refining break in this way is so that when fallthrough is needed, the existing idiom just works as everyone understands it to. > > My assumption is that that code can just keep doing what it's already doing. My claim is that there is only value to changing to expression switch if we are getting the benefit of how much more simple and constrained it is. > >> imho, early signs suggest that the grossness of `break x` is not nearly justified by the actual observed positive value of supporting multi-statement cases in expression switch. Are we open to killing that, or would we be if I produced more and clearer evidence? > > That's one valid interpretation of the data, but not the only. Whether making break behave more like return (takes a value in non-void context, doesn't take one in void context) is gross or natural is subjective. Here's my subjective interpretation: "great, most of the time, you don't have to use the escape hatch, but when you do, it has control flow just like the break we know, extended to take a value in non-void contexts, so will be fairly familiar." > > I think that there are features that make sense on their own, and there are features that totally make lots of sense assuming that you have heard the expert group's passionate explanation of why they make sense. (It reminds me of a certain Pied Piper focus group near the end of Silicon Valley season 2, but moving on.) I am concerned that "breaking a value" is of the second kind. > > > But setting aside subjective reactions, are there better alternatives? Let's review what has been considered already, and why they've been passed over: > > - Do nothing; only allow single expressions. Non-starter. > > We're just saying the feature seems to be at least 90% as applicable without it. Roughly. Why is it a non-starter for the other 10% to stick with the switch they've always had? 
I'm sure there are good answers to that, I'm not doubting there are, but I think we should explore them instead of just declaring something a non-starter by fiat. > > > case 1 -> 1; > case 2 -> { println("two"); break 2; } > > But that is pretty similar to what we have now, just with braces. If the concern is that we're stretching `break` too far, then this is just as bad. > > Worse, it has two significant additional downsides: > > 1. You can't fall through at all. (Yes, I know some people think this is an upside.) > > Yes! That! That's what we want. No fall-through. > > > But real code does use fallthrough, and this leaves them without any alternative; it also widens the asymmetry of expression switch vs statement switch. > > Well, the other thread I started today is me literally asking for asymmetry between this and statement switch. If we stop using the `switch` keyword, so much the better. > > What are the motivating use cases for fall-through in expression switch? These must be exclusively examples featuring side-effects, right? Or is there a way for a case to access the result produced by the previous one and build on it? > > > (Combine this with other suggestions that widen the asymmetry between pattern and non-pattern switch, and you have four switch constructs. Oops.) > > (Not familiar with that stuff yet.) > > > There might be other alternatives, but I don't see a better one, other than deprecating switch and designing a whole new mechanism. > > I'm confused. `switch` has worked the same way for 20+ years; what could possibly motivate us to deprecate it? > > > And, to defend what we've proposed: it's exactly the switch we all know, warts and all. Very little new; very little in the way of asymmetry between void/value and pattern/constant. > > (My response to this is already teed up in the other thread. Basically, it says that if we don't make expression switch suitably constrained then I have so far failed to grasp what its value is at all.) > > > The cost is that we have to accept the existing warts, primarily the weird block expression (blocks of statements with break not surrounded by braces), the weird scoping, and fallthrough. > > This choice reminds me of the old Yiddish proverb of the Tree of Sorrows. (https://www.inspirationalstories.com/0/69.html ). > > Tangent, but I think that story actually advocates that we stick with exactly the switch statement we already have today. > > > > If you've got something better ... > > > > On 3/13/2018 3:32 PM, Kevin Bourrillion wrote: >> On Fri, Mar 9, 2018 at 3:21 PM, Louis Wasserman > wrote: >> >> Simplifying: let's call normal cases in a switch simple if they're a single statement or a no-op fallthrough, and let's call a default simple if it's a single statement or it's not there at all. >> >> Among switches apparently convertible to expression switches, >> 81% had all simple normal cases and a simple default. >> 5% had all simple normal cases and a nonsimple default. >> 12% had a nonsimple normal case and a simple default. >> 2% had a nonsimple normal case and a nonsimple default. >> I was surprised it was as high as 19%, so I grabbed a random sample of these 45 occurrences from Google's codebase and reviewed them. My goal was to find evidence that multi-statement cases in expression switches are important and common. Spoiler: I found said evidence underwhelming. >> >> There were 3 that I would call false matches (e.g. two that simply used a void `return` instead of `break` after every case without reason). 
>> >> There were fully 20 out of the remaining 42 that I quickly concluded should be refactored regardless of anything else, and where that refactoring happens to leave them with only simple cases and simple/no default. These refactorings were varied (hoist out code common to all non-exception cases; simplify unreachable code; change to `if` if only 1-2 cases; extract a method (needing only 1-2 parameters) for a case that is much bigger than the others; switch from loop to Streams; change `if/else` to ?:; move a precondition check to a more appropriate location; and a few other varied cleanups). >> >> Next there were 7 examples where the non-simple cases included side-effecting code, like setting fields or calling void methods. In Google Style I expect that we will probably forbid (or at least strongly dissuade) side effects in expression switch. I should probably bring this up separately, but I am pretty convinced by now that users should see expression switch and procedural switch as two completely different things, and by convention should always keep the former purely functional. >> >> Next there were 7 examples where a case was "non-simple" only because it was using the "log, then return a null object (or null), instead of throwing an exception" anti-pattern. I was surprised this was that popular. and another 2 that used the "log-and-also-throw" anti-pattern. >> >> 2 examples had a use-once local variable that saved a little bit of nesting. I wouldn't normally refactor these, but if expression switch had no mechanism for multi-statement cases, I wouldn't think twice about it. >> >> 1 example had cases that looked nearly identical, 3 statements each, that could all be hoisted out of the switch, except that the types that differed across the three didn't implement a common interface (as they clearly should have). Slightly compelling. >> >> 1 example had all simple cases except that one also wanted to check an assertion. Okay, slightly compelling. >> >> Finally, the cases that were the most compelling to me: 3 examples had one or more large cases, where factoring them out into helper methods would imho be ugly because >=3 parameters would be required. If expression switch didn't permit multi-statement cases, I would just keep them as procedural switches. It's only 3 out of 42. >> >> Summary: >> >> imho, early signs suggest that the grossness of `break x` is not nearly justified by the actual observed positive value of supporting multi-statement cases in expression switch. Are we open to killing that, or would we be if I produced more and clearer evidence? >> >> >> >> >> >> >> On Fri, Mar 9, 2018 at 2:56 PM Brian Goetz > wrote: >> Did you happen to calculate what percentage was _not_ the "default" case? I would expect that to be a considerable fraction. >> >> On 3/9/2018 5:49 PM, Kevin Bourrillion wrote: >>> On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax > wrote: >>> >>> When i asked what we should do instead, the answer is either: >>> 1/ we should not allow block of codes in the expression switch but only expression >>> 2/ that we should use the lambda syntax with return, even if the semantics is different from the lambda semantics. >>> >>> I do not like (1) because i think the expression switch will become useless >>> >>> In our (large) codebase, +Louis determined that, among switch statements that appear translatable to expression switch, 13.8% of them seem to require at least one multi-statement case. >>> >> >> >> >> >> -- >> Kevin Bourrillion | Java Librarian | Google, Inc. 
| kevinb at google.com > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Wed Mar 14 14:18:49 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 14 Mar 2018 07:18:49 -0700 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: On Wed, Mar 14, 2018 at 5:14 AM, Victor Nazarov wrote: On Wed, Mar 14, 2018 at 4:38 AM, Brian Goetz wrote: > >> There are three arguments why the N case is significantly different from >> the 2 case. >> >> There are a number of idioms that require statements in addition to an >> expression. Debugging printfs, objects that take statements to initialize >> (construct/set/set/break), incrementing counters, cases that require side >> conditions (if today is tuesday do one thing, otherwise another), etc. >> Each individually is rare-ish, but not all that rare. >> >> The "static applicability" argument is that the larger the number of >> cases, the more likely one of them will fall into one these buckets, and >> then the whole thing has to fall back to statements. This makes >> expression switches less useful, and falling off this cliff is likely to >> irritate users every time it happens. >> >> The ?dynamic applicability? argument is that, if you want to change an >> existing switch (say, to add a debugging printf in one path), you have to >> refactor the whole thing. Which will be met, by users, with ?YGBFKM.? >> >> The ?cliff height? argument says that falling off the cliff on a two-way >> conditional and having to refactor to if-else is far less painful than >> falling off the cliff on an N-way switch. Its a more painful refactor. >> > > As someone who participated in bringing this topic back to attention I'd > like to say that I personally find these arguments substantial and > persuasive. I think it's enough to bring me peace considering current > switch-expression proposal (with restriction that we can't jump to a label > outside switch block). > Yes, this is a very substantive response to what makes expression switch different from `?:`. Thanks, Brian. I hope the point was still received that embedding statements inside an expression for immediate evaluation appears to be novel for Java. (All else being equal I assume that we are seeking to minimize novelty.) The only argument against left is > > I think that there are features that make sense on their own, and there are >> features that *totally make lots of sense* assuming that you have heard >> the >> expert group's passionate explanation of why they make sense. (It reminds >> me of a certain Pied Piper focus group near the end of Silicon Valley >> season 2, but moving on.) I am concerned that "breaking a value" is of the >> second kind." >> > > But this is not technical argument and I think it can't overweight > technical ones. > Oh, the quoted text doesn't constitute an "argument against" at all. It's attempting to be a reminder that "the existence of justifications that make sense to expert group type people" is not the same thing as "will fit nicely into the mental model of the 30th percentile Java developer". As this group debates language changes, I don't get a secure feeling that we are suitably concerned with what that mental model will be as opposed to what we think it *should* be. 
We don't get to actually *choose it; *users will pick it up from their own experiences interacting with the feature, and impressions they may soak up piecemeal from some page of a book, some slide of some presentation, (probably even more so) some muttered complaint of some colleague, etc. We can make only feeble attempts at best to nudge it directly. I am certain I'm not saying anything that everyone here does not already know. However, is it clear that this group is thinking in these terms regularly? A *lot* of what needs to be discussed is discussed at a level where you need to be a compiler or VM engineer to even understand it. I don't have the chops to even participate in most of these discussions. I would like to figure out how we can satisfy ourselves that we are properly accounting for the perspective of the "ordinary" programmer. Now, back to the quoted response, about the relative value of "non-technical arguments", I am not completely sure what it means. One way to read it which I doubt was intended is that all this "usability stuff" doesn't matter as much as the technical challenges to rev the spec, compiler, and VM. I hope that wasn't it. :-) -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Mar 14 14:55:54 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 14 Mar 2018 10:55:54 -0400 Subject: Record construction Message-ID: <977efc77-f9f5-6d7d-936e-9404dc7d4300@oracle.com> Let me summarize what we're proposing for record constructors.? I think we can simplify it so that it is easy to do parameter validation and normalization without falling off the syntactic cliff, and with fewer new features.? (As a bonus, I think these ideas mostly generalize to more concise non-record constructors too, but I'm going to leave that for another day -- but I'll just note that one of our goals for records is for them to be "just macros" for a set of finer-grained features, so that even if a class doesn't meet the requirements to be a record, it can still benefit from more concise construction, or equals/hashCode, or whatever.) Goals ----- it should be easy to add constructor parameter validation, like ??? Preconditions.require(x > 0); without significant repetition of record elements.? Similarly, it should be easy to normalize arguments before they are committed to fields: ??? if (name == null) ??????? name = ""; again without significant repetition.? If there are ancillary fields, it should be easy to initialize them in the constructor body without bringing back any boilerplate. Ideally, these mechanisms are consistent with construction idioms for non-record classes (records and classes should not be semantically different). Record Constructors ------------------- A record class: ??? record Point(int x, int y) { } has a "class signature" `(int x, int y)`.? Record classes always have a _default constructor_ whose signature matches the class signature; these can be implicit or explicit. The constructor syntax ??? record Point(int x, int y) { ??????? Point { ??????? } ??? } is proposed as a shorthand for an explicit default constructor: ??? record Point(int x, int y) { ??????? Point(int x, int y) { ??????? } ??? } The arguments A0..An are divided into super-arguments A0..Ak and this-arguments Ak+1..An.? The default super invocation is super(A0..Ak). 
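(For concreteness, a sketch of the compact form with validation and normalization, shown with the records feature as it eventually shipped in Java 16; the super-argument machinery described in this note is specific to the proposal and is not part of that shipped form:)

    record Point(int x, int y) {
        Point {                        // compact constructor: parameters are implicit
            if (x < 0 || y < 0)        // validation runs before the fields are written
                throw new IllegalArgumentException("negative coordinate");
        }
    }

    record Person(String name) {
        Person {
            if (name == null)
                name = "";             // normalizing the parameter "sticks": the field gets ""
        }
    }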
If a default constructor does not contain an explicit super-call, an implicit super constructor call is provided at the start of the constructor, just as we do now with implicit no-arg super-constructors. An implicit field initialization, consisting of `this.xi = xi` for i in k+1..n, is added to the _end_ of the constructor. This makes both validation and normalization work; if the constructor body contains only: ??? Preconditions.require(x > 0); (this is just a library call), the code is executed after the super-call, and `x` refers to the arguments.? Just like in constructors today.? Similarly, normalization: ??? if (name == null) ??????? name = ""; is executed after the super-call, but before the field initialization, so any the update to the parameter "sticks", and `name` refers to the argument, not the field, just as today.? So existing idioms work, just by removing the boilerplate. We can protect against double-initialization either by (a) not allowing the user to explicitly initialize fields corresponding to Ak+1..An, or (b) if the corresponding field is definitely initialized by the explicit constructor body, leave it out of the implicit initialization code. If users want to adjust what gets passed to the super constructor, just use an explicit super-call.? If users want to adjust what gets written to the field, overwrite the parameter before it is written to the field.? Both of these are consistent with how constructors work today. If the record has extra fields (if allowed), the constructor can just initialize them: ??? Point { ??????? norm = Math.sqrt(x*x + y*y); ??? } Again, `x` and `y` here refer to constructor arguments, and everything works, with no repetition. Summary: ?- The required idioms work, just leaving out the boilerplate ?- Very similar to existing constructors ?- No need to support statements before super (though we can add this later, if we see fit) ?- No need for `default.this(...)` idiom *at all* (pause for agreement) From forax at univ-mlv.fr Wed Mar 14 15:04:37 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 14 Mar 2018 16:04:37 +0100 (CET) Subject: break seen as a C archaism In-Reply-To: <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: <1154933572.1193584.1521039877190.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Kevin Bourrillion" > Cc: "Louis Wasserman" , "Remi Forax" , > "amber-spec-experts" > Envoy?: Mercredi 14 Mars 2018 02:38:21 > Objet: Re: break seen as a C archaism > There are three arguments why the N case is significantly different from the 2 > case. > There are a number of idioms that require statements in addition to an > expression. Debugging printfs, objects that take statements to initialize > (construct/set/set/break), incrementing counters, cases that require side > conditions (if today is tuesday do one thing, otherwise another), etc. Each > individually is rare-ish, but not all that rare. > The "static applicability" argument is that the larger the number of cases, the > more likely one of them will fall into one these buckets, and then the whole > thing has to fall back to statements. This makes expression switches less > useful, and falling off this cliff is likely to irritate users every time it > happens. 
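To make the cliff concrete: if one arm of an otherwise expression-shaped switch picks up a single extra statement (a debugging printf, say), then without some block or value-bearing escape hatch the whole thing drops back to statement form. A hypothetical sketch; `kind` and the labels are invented, and the "before" side uses the arrow spelling that eventually shipped (Java 14+):

    // Before: every arm is a single expression, so the expression form works.
    int weight = switch (kind) {
        case "light"  -> 1;
        case "medium" -> 5;
        default       -> 10;
    };

    // After adding one printf to one arm, absent an escape hatch the whole
    // switch has to be rewritten as a statement again:
    int weight2;
    switch (kind) {
        case "light":
            weight2 = 1;
            break;
        case "medium":
            System.out.println("saw a medium one");
            weight2 = 5;
            break;
        default:
            weight2 = 10;
            break;
    }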
As Kevin said in the other thread, instead of refactoring the whole switch, you can also add a helper method, after all, if all cases but one are expressions, it seems to be a good hint that the code of that case should be extracted to a new method. They are few switchs in the code (apart if you do parsing, see all the John's examples or ASM codes), usually in a program you can count them, so for usual code, the more cases you use, the less it's likely to appear. And i conjecture that it exists a n such after n cases all switch that exists are all in the generated codes. > The ?dynamic applicability? argument is that, if you want to change an existing > switch (say, to add a debugging printf in one path), you have to refactor the > whole thing. Which will be met, by users, with ?YGBFKM.? You mean if the case expression is enough complex that you want to print a part of it (otherwise you can printf the result of the switch), you have exactly the same issue with any complex expressions in Java, again, make the code simpler, create an helper method. > The ?cliff height? argument says that falling off the cliff on a two-way > conditional and having to refactor to if-else is far less painful than falling > off the cliff on an N-way switch. Its a more painful refactor. The "cliff height" argument was used by Josh Bosch when choosing if 'this' in a lambda should mean the lambda itself or the enclosing instance. So yes, sometimes, we will have to do a refactoring, IDEs will help, but the cliff height is only an argument if you have to jump over it often. > So for all these reasons, not being able to occasionally include some statements > means many more switches that can?t use the feature (which is safer, clearer, > and more compact), and also more often that the user will have to gratuitously > refactor perfectly good code as they make small changes. The 90% argument of Kevin is in my opinion stronger than these arguments. R?mi >> On Mar 13, 2018, at 8:59 PM, Kevin Bourrillion < [ mailto:kevinb at google.com | >> kevinb at google.com ] > wrote: >>>> - Do nothing; only allow single expressions. Non-starter. >>> We're just saying the feature seems to be at least 90% as applicable without it. >>> Roughly. Why is it a non-starter for the other 10% to stick with the switch >>> they've always had? I'm sure there are good answers to that, I'm not doubting >>> there are, but I think we should explore them instead of just declaring >>> something a non-starter by fiat. >> Also, if it is true that this is a "non-starter", I would assume it is also a >> non-starter to only allow single expressions in the conditional operator `?:`. >> If not, what is the fundamental difference? -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Mar 14 15:14:54 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 14 Mar 2018 11:14:54 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: > I hope the point was still received that embedding statements inside > an expression for immediate evaluation appears to be novel for Java. > (All else being equal I assume that we are seeking to minimize novelty.) Yes, we're seeking to minimize unnecessary novelty.? Winning is making things look like they were there all along.? 
(Sometimes we do better on this score than others; I don't hold out a lot of hope for "break n" being immediately recognized as something that was always lurking under the surface, especially given how much people hate break already, but I think we'll do pretty well on this score for integrating patterns in switch.)

I think a sensible next step would be for me to summarize the path by which we got here. I'll try to write that up in the next few days.

In the meantime, let me probe for what's really uncomfortable about the current design point. Is it:

 - That there are two ways to yield a value, -> e and "break e", and users won't be able to keep them straight;
 - The idea of using a statement at all to yield a value from an expression seems too roundabout;
 - That we are overloading an existing control construct, "break", to mean something just different enough to be uncomfortable;
 - Something else?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kevinb at google.com  Wed Mar 14 15:55:24 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 14 Mar 2018 08:55:24 -0700
Subject: expression switch vs. procedural switch
In-Reply-To: 
References: 
Message-ID: 

On Tue, Mar 13, 2018 at 1:02 PM, Kevin Bourrillion wrote:

> The more I have thought about it, the more I believe that 95% of the entire
> value of expression switch is that it *isn't procedural switch*, and is
> easier to reason about than procedural switch because of all things it
> *can't* do:
>
> - can't miss cases
> - can't return
> - can't break/continue a containing construct
> - can't fall through
> - (for constants or other disjoint patterns) can't depend on the order
>   of cases.
>
> As far as I can tell, its limitations are exactly what make it useful.

Brian reminded me in the other thread that as long as we voluntarily stick to `->` style for all cases, we get all of this. So, from my perspective, if we just adopt a style rule for Google Style that when using switch in an expression context one should stick to `->`, I might have basically what I want.

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From brian.goetz at oracle.com  Wed Mar 14 16:05:55 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 14 Mar 2018 12:05:55 -0400
Subject: expression switch vs. procedural switch
In-Reply-To: 
References: 
Message-ID: <9f3f814e-c5a3-e91c-8c95-1bd0fac2bb3f@oracle.com>

> Brian reminded me in the other thread that as long as we voluntarily
> stick to `->` style for all cases, we get all of this. So, from my
> perspective, if we just adopt a style rule for Google Style that when
> using switch in an expression context one should stick to `->`, I
> might have basically what I want.

I hope that's true; that was certainly the intent of the -> convention, which was intended to be the "safe and happy" place without undermining what switch really is or foreclosing on the other options when needed. Some additional comments inline.

> The more I have thought about it, the more I believe that 95% of
> the entire value of expression switch is that it /isn't procedural
> switch/, and is easier to reason about than procedural switch
> because of all things it /can't/ do:
>
> * can't miss cases

As John pointed out, some sort of help for exhaustiveness checking for _statement_ switches would be nice too.
We can't do this by default, because it would change the meaning of existing code, but it would be nice to be able to enlist the compiler's support on exhaustiveness.? This turns out to be harder than it looks, but I'll write some notes in a separate thread. > * can't return > * can't break/continue a containing construct > Even if you cross the line from -> to :, you still get these guardrails in expression switches. -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Wed Mar 14 16:06:51 2018 From: john.r.rose at oracle.com (John Rose) Date: Wed, 14 Mar 2018 09:06:51 -0700 Subject: expression switch vs. procedural switch In-Reply-To: References: Message-ID: <1A39FE03-B91E-4673-91BC-37B14387C1EB@oracle.com> On Mar 14, 2018, at 8:55 AM, Kevin Bourrillion wrote: > > Brian reminded me in the other thread that as long as we voluntarily stick to `->` style for all cases, we get all of this. So, from my perspective, if we just adopt a style rule for Google Style that when using switch in an expression context one should stick to `->`, I might have basically what I want. I agree it makes sense to aim for this as an "extra clean" notation, not a separate design but a subset of the whole design. It's a motivator for the design of multiple labels for one switch chunk (the nice kind of fallthrough). So "case 1,2,3" syntax is a tweak for "case 1: case 2: case 3", allowing either colon or arrow after the third case label. -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed Mar 14 16:16:29 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 14 Mar 2018 17:16:29 +0100 (CET) Subject: expression switch vs. procedural switch In-Reply-To: References: Message-ID: <332773002.1241075.1521044189363.JavaMail.zimbra@u-pem.fr> > De: "Kevin Bourrillion" > ?: "amber-spec-experts" > Envoy?: Mercredi 14 Mars 2018 16:55:24 > Objet: Re: expression switch vs. procedural switch > On Tue, Mar 13, 2018 at 1:02 PM, Kevin Bourrillion < [ mailto:kevinb at google.com > | kevinb at google.com ] > wrote: >> The more I have thought about it, the more I believe that 95% of the entire >> value of expression switch is that it isn't procedural switch , and is easier >> to reason about than procedural switch because of all things it can't do: >> * can't miss cases >> * can't return >> * can't break/continue a containing construct >> * can't fall through >> * (for constants or other disjoint patterns) can't depend on the order of cases. >> As far as I can tell, its limitations are exactly what make it useful. > Brian reminded me in the other thread that as long as we voluntarily stick to > `->` style for all cases, we get all of this. So, from my perspective, if we > just adopt a style rule for Google Style that when using switch in an > expression context one should stick to `->`, I might have basically what I > want. yes, but it's what i detest the most about C++, everyone has its own dialect. R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Wed Mar 14 16:12:34 2018 From: guy.steele at oracle.com (Guy Steele) Date: Wed, 14 Mar 2018 12:12:34 -0400 Subject: expression switch vs. 
procedural switch
In-Reply-To: <332773002.1241075.1521044189363.JavaMail.zimbra@u-pem.fr>
References: <332773002.1241075.1521044189363.JavaMail.zimbra@u-pem.fr>
Message-ID: <22C70FD8-6C75-4811-B700-C766BBD42457@oracle.com>

> On Mar 14, 2018, at 12:16 PM, Remi Forax wrote:
>
> De: "Kevin Bourrillion"
> À: "amber-spec-experts"
> Envoyé: Mercredi 14 Mars 2018 16:55:24
> Objet: Re: expression switch vs. procedural switch
>
> On Tue, Mar 13, 2018 at 1:02 PM, Kevin Bourrillion wrote:
>
> The more I have thought about it, the more I believe that 95% of the entire value of expression switch is that it isn't procedural switch, and is easier to reason about than procedural switch because of all things it can't do:
> - can't miss cases
> - can't return
> - can't break/continue a containing construct
> - can't fall through
> - (for constants or other disjoint patterns) can't depend on the order of cases.
> As far as I can tell, its limitations are exactly what make it useful.
>
> Brian reminded me in the other thread that as long as we voluntarily stick to `->` style for all cases, we get all of this. So, from my perspective, if we just adopt a style rule for Google Style that when using switch in an expression context one should stick to `->`, I might have basically what I want.
>
> yes, but it's what i detest the most about C++, everyone has its own dialect.

What is the solution? A style requirement that every programmer use every feature in the language at least once in any program? (I have known programmers like that, and their code was not necessarily any easier to read.)

I am sympathetic to your feeling about this, but have no idea how to encourage it or enforce it. You really can't prevent a programmer, or group of programmers, from sticking to a subset that makes them happy.

-- Guy

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guy.steele at oracle.com  Wed Mar 14 16:06:52 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 14 Mar 2018 12:06:52 -0400
Subject: Record construction
In-Reply-To: <977efc77-f9f5-6d7d-936e-9404dc7d4300@oracle.com>
References: <977efc77-f9f5-6d7d-936e-9404dc7d4300@oracle.com>
Message-ID: <97720F7A-75B1-49E4-B68D-1D83EF042490@oracle.com>

This all looks great to me, except see one comment below.

-- Guy

> On Mar 14, 2018, at 10:55 AM, Brian Goetz wrote:
>
> Let me summarize what we're proposing for record constructors.
> . . .
>
> We can protect against double-initialization either by (a) not allowing the user to explicitly initialize fields corresponding to Ak+1..An, or (b) if the corresponding field is definitely initialized by the explicit constructor body, leave it out of the implicit initialization code.

If we go with choice (b), I would recommend that it be amended to read:

(b) If the corresponding field is definitely initialized by the explicit constructor body, leave it out of the implicit initialization code; if the corresponding field is definitely not initialized by the explicit constructor body, put it in the implicit initialization code; and in all other cases it is a compile-time error.

This is to defend against code such as

    record Point(int x, int y) {
        Point {
            if (x > 0) this.y = x;
        }
    }

where if you put in the implicit "this.y = y;" then the user's code "if (x > 0) this.y = x;" has no effect, and if you don't put it in then this.y might not be initialized at all.

The user really should have written

    record Point(int x, int y) {
        Point {
            if (x > 0) y = x;
        }
    }

and the compile-time error might be enough to wake him up to this fact.

From brian.goetz at oracle.com  Wed Mar 14 16:28:57 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 14 Mar 2018 12:28:57 -0400
Subject: Record construction
In-Reply-To: <97720F7A-75B1-49E4-B68D-1D83EF042490@oracle.com>
References: <977efc77-f9f5-6d7d-936e-9404dc7d4300@oracle.com> <97720F7A-75B1-49E4-B68D-1D83EF042490@oracle.com>
Message-ID: 

> If we go with choice (b), I would recommend that it be amended to read:
>
> (b) If the corresponding field is definitely initialized by the explicit constructor body, leave it out of the implicit initialization code;
> if the corresponding field is definitely not initialized by the explicit constructor body, put it in the implicit initialization code;
> and in all other cases it is a compile-time error.
>
> This is to defend against code such as
>
> record Point(int x, int y) {
>     Point {
>         if (x > 0) this.y = x;
>     }
> }

Yes, that was the intent -- thanks for making this explicit. So DA -- OK, DU -- OK, but neither -- not OK.

From forax at univ-mlv.fr  Wed Mar 14 16:45:51 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 14 Mar 2018 17:45:51 +0100 (CET)
Subject: expression switch vs. procedural switch
In-Reply-To: <22C70FD8-6C75-4811-B700-C766BBD42457@oracle.com>
References: <332773002.1241075.1521044189363.JavaMail.zimbra@u-pem.fr> <22C70FD8-6C75-4811-B700-C766BBD42457@oracle.com>
Message-ID: <1474043504.1254967.1521045951434.JavaMail.zimbra@u-pem.fr>

> De: "Guy Steele"
> À: "Remi Forax"
> Cc: "Kevin Bourrillion" , "amber-spec-experts"
> Envoyé: Mercredi 14 Mars 2018 17:12:34
> Objet: Re: expression switch vs. procedural switch
>> On Mar 14, 2018, at 12:16 PM, Remi Forax < [ mailto:forax at univ-mlv.fr | forax at univ-mlv.fr ] > wrote:
>>> De: "Kevin Bourrillion" < [ mailto:kevinb at google.com | kevinb at google.com ] >
>>> À: "amber-spec-experts" < [ mailto:amber-spec-experts at openjdk.java.net | amber-spec-experts at openjdk.java.net ] >
>>> Envoyé: Mercredi 14 Mars 2018 16:55:24
>>> Objet: Re: expression switch vs. procedural switch
>>> On Tue, Mar 13, 2018 at 1:02 PM, Kevin Bourrillion < [ mailto:kevinb at google.com | kevinb at google.com ] > wrote:
>>>> The more I have thought about it, the more I believe that 95% of the entire
>>>> value of expression switch is that it isn't procedural switch, and is easier
>>>> to reason about than procedural switch because of all things it can't do:
>>>> * can't miss cases
>>>> * can't return
>>>> * can't break/continue a containing construct
>>>> * can't fall through
>>>> * (for constants or other disjoint patterns) can't depend on the order of cases.
>>>> As far as I can tell, its limitations are exactly what make it useful.
>>> Brian reminded me in the other thread that as long as we voluntarily stick to
>>> `->` style for all cases, we get all of this. So, from my perspective, if we
>>> just adopt a style rule for Google Style that when using switch in an
>>> expression context one should stick to `->`, I might have basically what I
>>> want.
>> yes, but it's what i detest the most about C++, everyone has its own dialect.
> What is the solution? A style requirement that every programmer use every
> feature in the language at least once in any program? (I have known programmers
> like that, and their code was not necessarily any easier to read.)
Not introducing a feature in the language which is used once a year is a good start. Do not add a solution to the corner^2 case (the corner case of a corner case, as Brian calls it) to the language.

> I am sympathetic to your feeling about this, but have no idea how to encourage
> it or enforce it. You really can't prevent a programmer, or group of
> programmers, from sticking to a subset that makes them happy.

On the human aspect of programming: publish an official language guideline and provide tools that enforce it, like Google does with Java or golang (with go-fmt).

> --Guy

Rémi

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mark at io7m.com Wed Mar 14 16:48:22 2018
From: mark at io7m.com (Mark Raynsford)
Date: Wed, 14 Mar 2018 16:48:22 +0000
Subject: Record construction
In-Reply-To: <977efc77-f9f5-6d7d-936e-9404dc7d4300@oracle.com>
References: <977efc77-f9f5-6d7d-936e-9404dc7d4300@oracle.com>
Message-ID: <20180314164822.026b81b6@copperhead.int.arc7.info>

On 2018-03-14T10:55:54 -0400
Brian Goetz wrote:
>
> The constructor syntax
>
>     record Point(int x, int y) {
>         Point {
>         }
>     }
>
> is proposed as a shorthand for an explicit default constructor:
>
>     record Point(int x, int y) {
>         Point(int x, int y) {
>         }
>     }
>
One small thing: Could the contents of this constructor be lifted into JavaDoc?

The fact that preconditions are supposed to appear in documentation is something that seems to be sadly lacking in almost all of the "design by contract" systems for Java. If I write a constructor like:

    record Point(int x, int y) {
        Point {
            Preconditions.mustBeNonNegative(x);
            Preconditions.mustBeNonNegative(y);
        }
    }

It'd be nice if those statements could be made visible in the JavaDoc. I think this was covered slightly in some of the other discussion about "requires", but it fizzled out (unless I missed something).

--
Mark Raynsford | http://www.io7m.com

From brian.goetz at oracle.com Wed Mar 14 16:58:59 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 14 Mar 2018 12:58:59 -0400
Subject: Patterns and nulls
Message-ID: <0cb9a653-d087-e4a2-da22-6d5395305580@oracle.com>

In the message "More on patterns, generics, null, and primitives", Gavin outlines how these constructs will be treated in pattern matching. This mail is a refinement of that, specifically, to refine how nulls are treated.

Rambling Background Of Why This Is A Problem At All
---------------------------------------------------

Nulls will always be a source of corner cases and surprises, so the best we can likely do is move the surprises around to coincide with existing surprise modes. One of the existing surprise modes is that switches on reference types (boxes, strings, and enums) currently always NPE when passed a null. You could characterize switch's current treatment of null as "La la la can't hear you la la la." (I think this decision was mostly made by frog-boiling; in Java 1.0, there were no switches on reference types, so it was not an issue; when switches on boxes were added, it was done by appeal to auto-unboxing, which throws on null, and null enums are rare enough that no one felt it was important enough to do something different for them. Then when we added string switch in 7, we were already mostly sliding down the slippery slope of past precedent.)

The "la la la" approach has gotten us pretty far, but I think finally runs out of gas when we have nested patterns. It might be OK to NPE when x = null here:
switch (x) { ??????? case String: ... ??????? case Integer: ... ??????? default: ... ??? } but it is certainly not OK to NPE when b = new Box(null): ??? switch (b) { ??????? case Box(String s): ... ??????? case Box(Integer i): ... ??????? case Box(Object o): ... ??? } since `Box(null)` is a perfectly reasonable box.? (Which of these patterns matches `Box(null)` is a different story, see below.)? So problem #1 with is that we need a way to match nulls in nested patterns; having nested patterns throw whenever any intermediate binding produces null would be crazy.? So, we have to deal with nulls in this way.? It seems natural, therefore, to be able to confront it directly: ??? case Box(null): ... which is just an ordinary nested pattern, where our target matches `Box(var x)` and further x matches null.? Which means `x matches null` need to be a thing, even if switch is hostile to nulls. But if you pull on this string a bit more, we'd also like to do the same at the top level, because we'd like to be able to refactor ??? switch (b) { ??????? case Box(null): ... ??????? case Box(Candy): ... ??????? case Box(Object): ... ??? } into ??? switch (b) { ??????? case Box(var x): ? ? ? ?? ?? switch (x) { ? ? ? ?? ?????? case null: ... case Candy: ... case Object: ... ??????????? } ??? } with no subtle semantics changes.? I think this is what users will expect, and cutting them on sharp edges here wouldn't be doing them favors. Null and Type Patterns ---------------------- The previous iteration outlined in Gavin's mail was motivated by a sensible goal, but I think we took it a little too literally. Which is that if I have a `Box(null)`, it should match the following: ??? case Box(var x): because it would be weird if `var x` in a nested context really meant "everything but null."? This led us to the position that ??? case Box(Object o): should also match `Box(null)`, because `var` is just type inference, and the compiler infers `Object` here from the signature of the `Box` deconstructor.? So `var` and the type that gets inferred should be treated the same.? (Note that Scala departs from this, and the results are pretty confusing.) You might convince yourself that `Box(Object)` not matching `Box(null)` is not a problem, just add a case to handle null, with an OR pattern (aka non-harmful fallthrough): ??? case Box(null): // fall through ??? case Box(Object): ... But, this only works in the simple case.? What if my Box deconstructor had four binding variables: ??? case Box(P, Q, R, S): Now, to capture the same semantics, you need four more cases: ??? case Box(null, Q, R, S): // fall through ??? case Box(P, null, R, S):// fall through ??? case Box(P, Q, null, S): // fall through ??? case Box(P, Q, R, null): // fall through ??? case Box(P, Q, R, S): But wait, it gets worse, since if P and friends have binding variables, and the null pattern does not, the binding variables will not be DA and therefore not be usable.? And if we graft binding variables onto constant patterns, we have a potential typing problem, since the type of merged binding variables in OR patterns should match.? So this is a tire fire, let's back away slowly. So, we want at least some type patterns to match null, at least in nested contexts.? Got it. This led us to: a type pattern `T t` should match null.? But clearly, in the switch ??? switch (aString) { ??????? case String s: ... ??? } it NPEs (since that's what it does today.)? So we moved the null hostility to `switch`, which involved an analysis of whether `case null` was present.? 
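To make that rule concrete (a sketch of the iteration being described here, not of the proposal below): these two switches would have differed only in the presence of a `case null` label, yet only the first would throw on a null target:

    switch (aString) {        // no case null: the switch stays null-hostile and NPEs
        case String s: ...
    }

    switch (aString) {        // case null present: a null target quietly selects that arm
        case null: ...
        case String s: ...
    }

So the null behavior of every arm depended on a label that might be screens away.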
As Kevin pointed out, that was pretty confusing for the users to keep track of.? So that's not so good. Also not so good: if type patterns match null, then the dominance order rule says you can't put a `case null` arm after a type pattern arm, because the `case null` will be dead.? (Just like you can't catch `IOException` after catching `Throwable`.)? Which deprived case null of most of its remaining usefulness, which is: lump null in with the default.? If users want to use `case null`, they most likely want this: ??? switch (o) { ??????? case A: ... ??????? case B: ... ??????? case null: // fall through ??????? default: ??????????? // deal with unexpected values ??? } If we can't do that -- which the latest iteration said we can't -- its pretty useless.? So, we got something wrong with type patterns too.? Tricky buggers, these nulls! Some Problems With the Current Plan ----------------------------------- The current plan, even though it came via a sensible path, has lots of problems.? Including: ?- Its hard to reason about which switches throw on null and which don't.? (This will never be easy, but we can make it less hard.) ?- We have asymmetries between nested and non-nested patterns; if we unroll a nested pattern to a nested switch, the semantics shift subtly out from under us. ?- There's no way to say "default including null", which is what people would actually want to do if they had explicit control over nulls.? Having `String s` match null means our ordering rules force the null case too early, depriving us of the ability to lump it in with another case. Further, while the intent of `Box(var x)` matches `Box(null)` was right, and that led us to `Box(Object)` matches `Box(null)`, we didn't pull this string to the end.? So let's break some assumptions and start over. Let's assume we have the following declarations: ??? record Box(Object); ??? Object o; ??? String s; ??? Box b; Implicitly, `Box` has a deconstruction pattern whose signature is `Box(out Object o)`. What will users expect on the following? ??? Box b = new Box(null); ??? switch (b) { ??????? case Box(Candy x): ... ??????? case Box(Frog f): ... ??????? case Box(Object o): ... ??? } There are four non-ridiculous possibilities: ?- NPE ?- Match none ?- Match Box(Candy) ?- Match Box(Object) I argued above why NPE is undesirable; I think matching none of them would also be pretty surprising, since `Box(null)` is a perfectly reasonable element of the value set decribed by the pattern `Box(Object)`.? If all type patterns match null, we'd match `Box(Candy)` -- but that's pretty weird and arbitrary, and probably not what the user expects.? It also means -- and this is a serious smell -- that we couldn't freely reorder the independent cases `Box(Candy)` and `Box(Frog)` without subtly altering behavior.? Yuck! So the only reasonable outcome is that it matches `Box(Object)`.? We'll need a credible theory why we bypass the candy and the frog buckets, but I think this is what the user will expect -- `Box(Object)` is our catch-all bucket. A Credible Theory ----------------- Recall that matching a nested pattern `x matches Box(P)` means: ??? x matches Box(var alpha) && alpha matches P The theory by which we can reasonably claim that `Box(Object)` matches `Box(null)` is that the nested pattern `Object` is _total_ on the type of its target (alpha), and therefore can be statically deemed to match without additional dynamic checks.? In ??????? case Box(Candy x): ... ??????? case Box(Frog f): ... ??????? case Box(Object o): ... 
the first two cases require additional dynamic type tests (instanceof Candy / Frog), but the latter, if the target is a `Box` at all, requires no further dynamic testing.? So we can _define_ `T t` to mean: ??? match(T t, e : U) === U <: T ? true : e instanceof U In other words, a total type pattern matches null, but a partial type pattern does not.? That's great for the type system weenies, but does it help the users?? I claim it does. It means that in: ??? Box b = new Box(null); ??? switch (b) { ??????? case Box(Candy x): ... ??????? case Box(Frog f): ... ??????? case Box(Object o): ... ??? } We match `Box(Object)`, which is the catch-all `Box` handler. We can freely reorder the first two cases, because they're unordered by dominance, but we can't reorder either of them with `Box(Object)`, because that would create a dead case arm.? `Box(var x)` and `Box(T x)` mean the same thing when `T` is the type that inference produces. So `Box(Candy)` selects all boxes known to contain candy; `Box(Frog)` all boxes known to contain frogs; `Box(null)` selects a box containing null, and `Box(_)` or `Box(var x)` or `Box(Object o)` selects all boxes. Further, we can unroll the above to: ??? Box b = new Box(null); ??? switch (b) { ??????? case Box(var x): switch (x) { case Candy c: ... case Frog f: ... case Object o: ... ??????????? } ??? } and it means _the same thing_; the nulls flow into the `Object` catch basin, and I can still freely recorder the Candy/Frog cases. Whew. This feels like we're getting somewhere. We can also now flow the `case null` down to where it falls through into the "everything else" bucket, because type patterns no longer match nulls.? If specified at all, this is probably where the user most wants to put it. Note also that the notion of a "total pattern" (one whose applicability, possibly modulo null, can be determined statically) comes up elsewhere too.? We talked about a let-bind statement: ?? let Point(var x, var y) = p In order for the compiler to know that an `else` is not required on a let-bind, the pattern has to be total on the static type of the target.? So this notion of totality is a useful one. Where totality starts to feel uncomfortable is the fact that while null _matches_ `Object o`, it is not `instanceof Object`.? More on this later. This addresses all the problems we stated above, so what's the problem? Default becomes legacy ---------------------- The catch is that the irregularity of `default` becomes even more problematic.? The cure is we give `default` a gold watch, thank it for its services, and grant it "Keyword Emeritus" status. What's wrong with default?? First, it's syntactically irregular.? It's not a pattern, so doesn't easily admit nesting or binding variables.? And second, its semantically irregular; it means "everything else (but not null!)"? Which makes it a poor catch-all.? We'd like for our catch-all case -- the one that dominates all other possible cases -- to catch everything.? We thought we wanted `default` to be equivalent to a total pattern, but default is insufficiently total. So, let's define a _constant switch_ as one whose target is the existing constant types (primitives, their boxes, strings, and enums) and whose labels are all constants (the latter condition might not be needed).? In a constant switch, retcon default to mean "all the constants I've not explicitly enumerated, except null."? 
(If you want to flow nulls into the default bin too, just add an explicit `case null` to fall into default, _or_ replace `default` with a total pattern.)? We act as if that constant switches have an implicit "case null: NPE" _at the bottom_.? If you don't handle null explicitly (a total pattern counts as handling it explicitly), you fall into that bucket. Then, we _ban_ default in non-constant switches.? So if you want patterns, swap your old deficient `default` for new shiny total patterns, which are a better default, and are truly exhaustive (rather than modulo-null exhaustive).? If we can do a little more to express the intention of exhaustiveness for statement switches (which are not required to be exhaustive), this gives us a path to "switches never throw NPE if you follow XYZ rules." There's more work to do here to get to this statically-provable null-safe switch future, but I think this is a very positive direction.? (Of course, we can't prevent NPEs from people matching against `Object o` and then dereferencing o.) Instanceof becomes instanceof ----------------------------- The other catch is that we can't use `instanceof` to be the spelling of our `matches` operator, because it conflicts with existing `instanceof` treatment of nulls.? I think that's OK; `instanceof` is a low-level primitive; matching is a high-level construct defined partially in terms of instanceof. From brian.goetz at oracle.com Wed Mar 14 17:10:16 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 14 Mar 2018 13:10:16 -0400 Subject: Record construction In-Reply-To: <20180314164822.026b81b6@copperhead.int.arc7.info> References: <977efc77-f9f5-6d7d-936e-9404dc7d4300@oracle.com> <20180314164822.026b81b6@copperhead.int.arc7.info> Message-ID: <753173d2-aed1-f256-c5ab-0aa294b6af4e@oracle.com> > One small thing: Could the contents of this constructor be lifted into > JavaDoc? The contents of the constructor under this model is surely less burdened by boilerplate, but will still have implementation details that would make for some strange specification reading.? So its not quite liftable directly.? But the question is still a fair one; how do we better document preconditions, without requiring the user say everything twice? One possibility, as discussed, is to attach these as their own thing: "require x > 0", which is code that is narrowly targeted enough to lift. Another is to raise the Preconditions library to more than just a library; let the Javadoc tool have some special treatment for, say, Precondition calls that unconditionally executed in the constructor, where they are boiled down into a documentary form.? This has the usual problem of trying to distill intent from imperative code, but perhaps could be sufficiently constrained to work well enough for 95% of preconditions. Kevin and I have been chatting offline about how some of the accidental problems with Preconditions can be improved with the compiler magic that was described in my recent document on JEP 303. I think that if we can clear some of these accidental hurdles, a standard Precondition library could be more practical, which might pave the way for some of this.? (I'll let Kevin explain further.) Either way, though, I think better support for DBC is an orthogonal feature; I don't think it significantly constrains our ability to deliver a records feature without it. 
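For the record, here is the shape of constructor I have in mind when I talk about Precondition calls that execute unconditionally -- a sketch only, using the compact constructor form proposed in the records thread; the record and its components are invented for illustration, and the checks use only java.util.Objects plus a plain throw:

    record NamedRange(String name, int lo, int hi) {
        NamedRange {
            // Both checks run unconditionally on every construction, so a
            // Javadoc tool could plausibly lift them into the generated docs.
            java.util.Objects.requireNonNull(name, "name must be non-null");
            if (lo > hi)
                throw new IllegalArgumentException("lo must be <= hi");
        }
    }

A doc tool that recognized only this restricted shape -- straight-line checks at the top of the constructor -- would already cover a large fraction of real preconditions.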
From kevinb at google.com Wed Mar 14 17:14:12 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 14 Mar 2018 10:14:12 -0700 Subject: Preconditions (for records, or otherwise) In-Reply-To: References: Message-ID: This thread was, at first, discussing both records-specific and general approaches; the records-specific part is now being addressed in other threads, but I think the following has not yet been engaged with: On Fri, Mar 9, 2018 at 1:56 PM, Kevin Bourrillion wrote: But I don't want to give up too easily on a more *general* approach that > would apply to records, methods, and constructors. That's been sketched at > times as > > void foo(int num, String unrelated) > requires (num >= 0) { > ... > } > > where `requires` takes a boolean expression*, which lives in the same > scope as the body proper; if it evaluates to false, an exception is thrown > and the body is never entered. > > The main criticism I hear about this is that it feels like a *"method > with two bodies"*. To that I'd point out that > > - it is only an *expression* -- and anything even moderately complex > ought to be factored out, just like we advise for lambdas > - this expression isn't implementation; it's contract, so frankly it *belongs > *in this elevated place more than it does in the body. It is > information that pertains, not really to the body, but to the communication > between caller and body - just like the signature does. > - this way, the preconditions can be *inherited* by default in an > overriding method, which seems awfully convenient to me right now. (If you > have some conditions you wouldn't want inherited for some reason, keep > those in the regular body. I'm not sure whether these are *technically* LSP > violations, but in pragmatic terms they don't seem to be, to me) > > I bring all this up because some of the upsides seem quite compelling to > me: > > - The automatically composed exception *message* will be more useful > than what 90% of users bother to string together (and the other 10% are > wasting time and space dealing with it). > - These expressions can be displayed in generated *documentation* so > you don't have to write them out a second time in prose. > - I admit this may feel weird for a core language feature, but you can > choose the idiomatic exception *type* automatically: if the expression > involved at least one parameter, it's IAE; otherwise it's probably ISE > (except in the amusing case of `requires (false)` it is UOE). (Again, maybe > this is too weird.) > - Some of these expressions are *verifiable* statically. For example a > call to `foo(-1, "x")` (using example above) should be caught by javac. I > suppose we teach it to recognize cases like empty collections through > compiler plugins. > > Note that the other design-by-contract idioms are still addressed well > enough by `assert`; we only need this one because `assert` disclaims this > use case (for good reason). > > (*why I say it should take one boolean expression, not a comma-separated > list: I think we might as well let the user choose between short-circuiting > or not, by using && and & directly, which makes it clear to readers as > well. Well, that is, charitably assuming that reader remembers the > difference.) > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kevinb at google.com Wed Mar 14 18:04:27 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 14 Mar 2018 11:04:27 -0700 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: On Wed, Mar 14, 2018 at 8:14 AM, Brian Goetz wrote: In the meantime, let me probe for what's really uncomfortable about the > current design point. Is it: > - That there are two ways to yield a value, -> e and "break e", and users > won't be able to keep them straight; > Nope. > - The idea of using a statement at all to yield a value from an > expression seems too roundabout; > Not really. - That we are overloading an existing control construct, "break", to mean > something just different enough to be uncomfortable; > To some degree yes, since `break ` already means something. > - Something else? > Part of it is the ability to embed a number of statements inside an expression (please, at least require curly braces, but still). Part of it is that I know how to make sense of (a) current switch and (b) a simple well-behaved nice expression switch that only uses `->`, but knowing that I may have to deal with (c) code that is some mixture between the two feels like additional level of complexity to me. Even if from an implementation standpoint it's not. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Wed Mar 14 17:57:06 2018 From: guy.steele at oracle.com (Guy Steele) Date: Wed, 14 Mar 2018 13:57:06 -0400 Subject: expression switch vs. procedural switch In-Reply-To: <1474043504.1254967.1521045951434.JavaMail.zimbra@u-pem.fr> References: <332773002.1241075.1521044189363.JavaMail.zimbra@u-pem.fr> <22C70FD8-6C75-4811-B700-C766BBD42457@oracle.com> <1474043504.1254967.1521045951434.JavaMail.zimbra@u-pem.fr> Message-ID: <36E09E6E-DEAD-449F-890E-C92373973172@oracle.com> > On Mar 14, 2018, at 12:45 PM, forax at univ-mlv.fr wrote: > > De: "Guy Steele" > ?: "Remi Forax" > Cc: "Kevin Bourrillion" , "amber-spec-experts" > Envoy?: Mercredi 14 Mars 2018 17:12:34 > Objet: Re: expression switch vs. procedural switch > On Mar 14, 2018, at 12:16 PM, Remi Forax > wrote: > . . . > yes, but it's what i detest the most about C++, everyone has its own dialect. > > What is the solution? A style requirement that every programmer use every feature in the language at least once in any program? (I have known programmers like that, and their code was not necessarily any easier to read.) > > Do not introduce a feature in the language which is used once every year is a good start. > Do not add a solution to solve the corner^2 case (the corner case of a corner case as Brian call it) in the language. These are good answers to my question, thanks! > > I am sympathetic to your feeling about this, but have no idea how to encourage it or enforce it. You really can?t prevent a programmer, or group of programmers, from sticking to a subset that makes them happy. > > on the Human aspect of programming, publish an official language guideline and provides tools that enforce it like Google does with Java or golang (with go-fmt). I agree that common guidelines are a good thing. But you still can?t prevent programmers from choosing their own ?happy subsets? of even the official guideline. 
Simple example: suppose I choose, as my own special discipline (which I have sometimes used) never to use `break` to break out of a `for` or `while` loop, but instead insist on providing a label and using `break label;`. The rationale is that whenever I see a plain `break` in my code, I always know it?s for a `switch` statement. If you agree with this idea, then I win: I get to use my happy subset. If you disagree, then I really win: it demonstrates that you and I would choose different happy subsets! :-) :-) :-) (Don?t mind me; I?m feeling puckish today.) ?Guy -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Wed Mar 14 18:14:00 2018 From: john.r.rose at oracle.com (John Rose) Date: Wed, 14 Mar 2018 11:14:00 -0700 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: <99B4E2B7-7D7B-4FB1-83B7-FD4B42A950BB@oracle.com> On Mar 14, 2018, at 11:04 AM, Kevin Bourrillion wrote: > > Part of it is the ability to embed a number of statements inside an expression (please, at least require curly braces, but still). Well, we do, look again! The switch as a whole requires the curlies you want. I'm not being completely flip here. If we feel a need for for more curlies inside the existing curlies, I think we may be at risk for this syndrome: Making new language features #{ Stand Out } http://gafter.blogspot.com/2017/06/making-new-language-features-stand-out.html ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Wed Mar 14 18:17:38 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 14 Mar 2018 11:17:38 -0700 Subject: break seen as a C archaism In-Reply-To: <99B4E2B7-7D7B-4FB1-83B7-FD4B42A950BB@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <99B4E2B7-7D7B-4FB1-83B7-FD4B42A950BB@oracle.com> Message-ID: Okay that was a derp on my part, thanks. On Wed, Mar 14, 2018 at 11:14 AM, John Rose wrote: > On Mar 14, 2018, at 11:04 AM, Kevin Bourrillion wrote: > > > Part of it is the ability to embed a number of statements inside an > expression (please, at least require curly braces, but still). > > > Well, we do, look again! The switch as a whole requires the curlies you > want. > > I'm not being completely flip here. If we feel a need for for more > curlies inside > the existing curlies, I think we may be at risk for this syndrome: > > Making new language features #{ Stand Out } > http://gafter.blogspot.com/2017/06/making-new-language- > features-stand-out.html > > ? John > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Mar 14 18:32:15 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 14 Mar 2018 14:32:15 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: <99932769-0f5e-164d-8900-51b9ed8f49b4@oracle.com> > ?- That we are overloading an existing control construct, "break", > to mean something just different enough to be uncomfortable; > > > To some degree yes, since `break ` already means something. 
Digging deeper: If we spelled "break " differently (yield, emit, defuse), would it be significantly different?? I think reusing "return" is worse than reusing "break", but there are other choices.? (Though introducing a new keyword has its own user-model challenges.) > ?Part of it is that I know how to make sense of (a) current switch and > (b) a simple well-behaved nice expression switch that only uses `->`, > but knowing that I may have to deal with (c) code that is some mixture > between the two feels like additional level of complexity to me. Even > if from an implementation standpoint it's not. I like to think that this is pedagogical, stemming from thinking of switch expressions and switch statements as unrelated things.? If we view expression switches as a generalization of existing switch, I think that the dichotomy between A/B can go away.? But only if there is a clear enough explanation that everyone will eventually receive. C is still an issue, and I do get the discomfort of mixing both -> and : cases, and I agree that good style will minimize mixing.? Outlawing mixing entirely isn't a great answer, though; its too common to use -> for all the cases except default, which often needs statements to do its thing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dl at cs.oswego.edu Wed Mar 14 18:32:47 2018 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 14 Mar 2018 14:32:47 -0400 Subject: expression switch vs. procedural switch In-Reply-To: <9f3f814e-c5a3-e91c-8c95-1bd0fac2bb3f@oracle.com> References: <9f3f814e-c5a3-e91c-8c95-1bd0fac2bb3f@oracle.com> Message-ID: <06d7bae4-83a9-b9c5-406c-ca33cabafc50@cs.oswego.edu> In case anyone is curious, I've been following this without much to add to discussion: Ever since seeing Brian's initial proposal (which basically remains intact), I haven't thought of or seen anything better. I think already I hold the record for density of SSA-like constructions in Java, and am looking forward to beating my own record using expression switches even if they (rarely) need weird breaky constructions. -Doug From john.r.rose at oracle.com Wed Mar 14 18:45:46 2018 From: john.r.rose at oracle.com (John Rose) Date: Wed, 14 Mar 2018 11:45:46 -0700 Subject: expression switch vs. procedural switch In-Reply-To: <06d7bae4-83a9-b9c5-406c-ca33cabafc50@cs.oswego.edu> References: <9f3f814e-c5a3-e91c-8c95-1bd0fac2bb3f@oracle.com> <06d7bae4-83a9-b9c5-406c-ca33cabafc50@cs.oswego.edu> Message-ID: <01B27D0E-1999-4337-9A73-CE8185831DEF@oracle.com> On Mar 14, 2018, at 11:32 AM, Doug Lea
wrote: > > looking forward to > beating my own record using expression switches even if they > (rarely) need weird breaky constructions That pushed a button: "Don't play that song, that achy breaky song..." https://en.wikipedia.org/wiki/Achy_Breaky_Song -------------- next part -------------- An HTML attachment was scrubbed... URL: From lowasser at google.com Wed Mar 14 19:38:51 2018 From: lowasser at google.com (Louis Wasserman) Date: Wed, 14 Mar 2018 19:38:51 +0000 Subject: break seen as a C archaism In-Reply-To: <99932769-0f5e-164d-8900-51b9ed8f49b4@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <99932769-0f5e-164d-8900-51b9ed8f49b4@oracle.com> Message-ID: Just to make sure we have some numbers when talking about fallthrough: - Among all switches, we calculate that 2.4% of switches in the Google codebase have some nontrivial fallthrough. (This is possibly an overestimate due to overly conservative control flow analysis, but not an underestimate.) - Determining whether switches with nontrivial fallthrough are convertible to expression switches is a little difficult in terms of control flow analysis. As the best proxy with the dataset I already scraped, I defined "convertible to expression switch" as "switches in which all cases *that provably exit* either return, or assign to the same variable", and among those, 1.2% of switches have nontrivial fallthrough. On Wed, Mar 14, 2018 at 11:32 AM Brian Goetz wrote: > > > - That we are overloading an existing control construct, "break", to mean >> something just different enough to be uncomfortable; >> > > To some degree yes, since `break ` already means something. > > > Digging deeper: If we spelled "break " differently (yield, emit, > defuse), would it be significantly different? I think reusing "return" is > worse than reusing "break", but there are other choices. (Though > introducing a new keyword has its own user-model challenges.) > > Part of it is that I know how to make sense of (a) current switch and (b) > a simple well-behaved nice expression switch that only uses `->`, but > knowing that I may have to deal with (c) code that is some mixture between > the two feels like additional level of complexity to me. Even if from an > implementation standpoint it's not. > > > I like to think that this is pedagogical, stemming from thinking of switch > expressions and switch statements as unrelated things. If we view > expression switches as a generalization of existing switch, I think that > the dichotomy between A/B can go away. But only if there is a clear enough > explanation that everyone will eventually receive. > > C is still an issue, and I do get the discomfort of mixing both -> and : > cases, and I agree that good style will minimize mixing. Outlawing mixing > entirely isn't a great answer, though; its too common to use -> for all the > cases except default, which often needs statements to do its thing. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mark at io7m.com Wed Mar 14 19:53:37 2018 From: mark at io7m.com (Mark Raynsford) Date: Wed, 14 Mar 2018 19:53:37 +0000 Subject: Record construction In-Reply-To: <753173d2-aed1-f256-c5ab-0aa294b6af4e@oracle.com> References: <977efc77-f9f5-6d7d-936e-9404dc7d4300@oracle.com> <20180314164822.026b81b6@copperhead.int.arc7.info> <753173d2-aed1-f256-c5ab-0aa294b6af4e@oracle.com> Message-ID: <20180314195337.5cdcac60@copperhead.int.arc7.info> On 2018-03-14T13:10:16 -0400 Brian Goetz wrote: > > One possibility, as discussed, is to attach these as their own thing: > "require x > 0", which is code that is narrowly targeted enough to lift. > > Another is to raise the Preconditions library to more than just a > library... To me, it seems like the former would be better in general: A boolean-valued expression, transformed to a string in a similar manner to assertions in order to provide a nice error message. I think the bytecode would probably need to wrap any exception raised by the expressions in an IllegalStateException that distinguishes it from an exception raised due to the check failing (I think a check raising its own exception should be considered to be a bug). I'd personally not want to see any particular precondition library have elevated status. Various libraries are designed to work in different environments. For example, I wrote a small thing that I tend to use everywhere now that's designed to be zero-allocation in the case of non-failing checks: https://io7m.github.io/jaffirm/ If, for example, the Preconditions library didn't make this guarantee, then I'd be hesitant to use preconditions on records at all in any code that did want that zero-allocation guarantee. The point is less about one API being preferred over another, and more about making sure that people actually do put validation everywhere because we can give assurances that the performance is good. > Either way, though, I think better support for DBC is an orthogonal > feature; I don't think it significantly constrains our ability to > deliver a records feature without it. Probably not, no. I must admit that the handling of non-null record arguments prompted me to bring this up (because adding Objects.requireNonNull() to every class I ever write adds up to a lot of boilerplate). I need to go back and read up on the current status of null handling, but I feel like it'd be nice if it could be handled via a generic "requires" mechanism along with all of the other checks - if we don't have a more terse way to say "this field must never be null" in the field declaration, of course! -- Mark Raynsford | http://www.io7m.com From john.r.rose at oracle.com Thu Mar 15 00:25:54 2018 From: john.r.rose at oracle.com (John Rose) Date: Wed, 14 Mar 2018 17:25:54 -0700 Subject: Patterns and nulls In-Reply-To: <0cb9a653-d087-e4a2-da22-6d5395305580@oracle.com> References: <0cb9a653-d087-e4a2-da22-6d5395305580@oracle.com> Message-ID: <68C3406F-AEB8-47A8-9AD1-753A36B185A4@oracle.com> Good write-up; this is tricky reasoning and needs to be presented in a block, or we'll spend the next several months answering one-line emails that start "Why didn't you just?". See also: https://blogs.oracle.com/jrose/feynmans-inbox Here are a few extra footnotes, to amplify the argument. On Mar 14, 2018, at 9:58 AM, Brian Goetz wrote: > > ...The "la la la" approach has gotten us pretty far? This "la la la" works great for the null-free coding style, which many of us try to stay within. 
It stinks for the null-using coding style, which sometimes we choose and sometimes is forced on us by APIs for which null is a significant value. We're not going to suddenly pick a winning style, so our design must include support for both styles. "But it is tolerated so well for legacy switch" is not a valid objection, since we are making a pattern facility that goes far beyond legacy types and includes other operations which already *do* tolerate nulls (cast, instanceof). > ...But if you pull on this string a bit more, we'd also like to do the same at the top level, because we'd like to be able to refactor? And also because the null-using style is legitimate in Java. Null-haters may groan, but we must have patterns which "know about" null so null-users can gracefully thread their null values through the new constructs (where it makes sense of course). (BTW, anybody who uses Map::get is a null-user, at least briefly.) > ...So, we want at least some type patterns to match null, at least in nested contexts. Got it. (Also, null-users will object strongly if we excise null from the value space of "var" and similar constructs which don't appear to incorporate null checks in other places. They will say, "stop checking my nulls for me by default; either I want them, or I have some other coding practice for diagnosing accidental nulls". The null-haters won't be benefited much by the extra checks either; presumably they have a more aggressive set of checks embedded in their code style.) > > ? So we can _define_ `T t` to mean: > > match(T t, e : U) === U <: T ? true : e instanceof U > > In other words, a total type pattern matches null, but a partial type pattern does not. Type test patterns smuggle instanceof into the world of patterns. Instanceof, for very good reasons, does not recognize null. If you go ahead and cast the null, the language will let you, but the common combo of instanceof/checkcast, which is what narrowing type patterns embody, does not accept nulls. If you are consciously narrowing a wide reference to a narrow type, you know the narrowing might fail, and you also know you won't get nulls there. These are rules that both null-users and null-haters can live with easily. If on the other hand you are just binding a reference to the same type (or a super), then you know it's not a narrowing, and it can't fail, and so any nulls (love 'em or hate 'em) will get through. This, this design depends on the fact that programmers are usually aware (and also their IDEs are aware) of which type occurrences are partial and which are total. With patterns the distinction is a little more subtle, but we think it will be obvious in practice. (Yes, you can make puzzlers here. Even |,& expressions can pose NP-hard problems and long identifier spellings can encode messages in binary. But abusus non tollit usum; the abuse doesn't invalidate the use.) > ... > There's more work to do here to get to this statically-provable null-safe switch future, but I think this is a very positive direction. (Of course, we can't prevent NPEs from people matching against `Object o` and then dereferencing o.) (Those null-users! They get what's coming to them.) > Instanceof becomes instanceof > ----------------------------- > > The other catch is that we can't use `instanceof` to be the spelling of our `matches` operator, because it conflicts with existing `instanceof` treatment of nulls. I think that's OK; `instanceof` is a low-level primitive; matching is a high-level construct defined partially in terms of instanceof. 
Above I claimed that people already know the difference between types which are partial and those which are total. In the case of instanceof, ask yourself what you would think if you saw this code: if (x instanceof Object) System.out.println("got an object"); Most programmers would feel something was wrong, since the instanceof is never followed by Object. They might look upward in the file to find out what is the strange type of x. After puzzling for a while they might figure out how the code works; some wouldn't. I claim that their initial unease comes from their familiarity with Object as a total type, and finding this total type in a partial position (after "instanceof"). And of course programmers know what happens when you use a partial type as if it were total: String x = myObject; First the IDE turns red and then you get a javac error. The fix puts String into a partial construct, a cast. Someone will point out at this point that the partial/total distinction is determined by context; if "myObject" is a string, then the above assignment is OK and "instanceof String" is paradoxical in the same way as "instanceof Object". To which we can only reply, yes, you noticed. Constructs in Java rely on context for their meaning, and you don't always put extra markings on the uses to signal the meanings. It's a matter for experiment to find out whether people's sense of partial vs. total will extend to patterns. We think there will be enough contextual cues in real-world code (like, "this pattern came last and the compiler didn't complain!") to keep things straight. And IDEs will help also. So, Brian's rule for type pattern matching breaks type patterns apart in a way programmers are already experienced with, and assigns the expected behavior to each half. This behavior is neutral with respect to both null-hating and null-using code. For this reason, switches built on top of it are also neutral. What's more, their type test patterns commute correctly. That looks a lot like winning to me. Let's try it. ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Thu Mar 15 18:11:31 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 15 Mar 2018 14:11:31 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: On 3/14/2018 2:04 PM, Kevin Bourrillion wrote: > On Wed, Mar 14, 2018 at 8:14 AM, Brian Goetz > wrote: > > In the meantime, let me probe for what's really uncomfortable > about the current design point.? Is it: > > > ?- That we are overloading an existing control construct, "break", > to mean something just different enough to be uncomfortable; > > > To some degree yes, since `break ` already means something. We had rejected this earlier for fairly obvious reasons, but let me ask to get a subjective response: would using "return x" be better? On the one hand, it's not really a return, and it doesn't build on the user intuition about the control flow aspects of break, but on the other, the return statement is already prepared to take a value, so its not adding a "new form" to the existing statement, though it is adding a new and different context.? (We abuse it slightly in lambdas, but people seem OK with this, probably because they think of lambdas as methods anyway.) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kevinb at google.com Thu Mar 15 18:18:21 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Thu, 15 Mar 2018 11:18:21 -0700 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: In a world where expression switch and statement switch are two very different things, it *might* be okay to do this, but given what `return x` does in a statement switch, this is probably a much *worse* conflict that the conflict with `break label`. All options are bad... except that we rescue one of them with our probable style rule to always stick with `->` in expression switches. On Thu, Mar 15, 2018 at 11:11 AM, Brian Goetz wrote: > > > On 3/14/2018 2:04 PM, Kevin Bourrillion wrote: > > On Wed, Mar 14, 2018 at 8:14 AM, Brian Goetz > wrote: > > In the meantime, let me probe for what's really uncomfortable about the >> current design point. Is it: >> > > - That we are overloading an existing control construct, "break", to mean >> something just different enough to be uncomfortable; >> > > To some degree yes, since `break ` already means something. > > > We had rejected this earlier for fairly obvious reasons, but let me ask to > get a subjective response: would using "return x" be better? On the one > hand, it's not really a return, and it doesn't build on the user intuition > about the control flow aspects of break, but on the other, the return > statement is already prepared to take a value, so its not adding a "new > form" to the existing statement, though it is adding a new and different > context. (We abuse it slightly in lambdas, but people seem OK with this, > probably because they think of lambdas as methods anyway.) > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From dl at cs.oswego.edu Thu Mar 15 18:33:42 2018 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 15 Mar 2018 14:33:42 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> On 03/15/2018 02:11 PM, Brian Goetz wrote: >> - That we are overloading an existing control construct, "break", >> to mean something just different enough to be uncomfortable; >> >> >> To some degree yes, since `break ` already means >> something. > > We had rejected this earlier for fairly obvious reasons, but let me > ask to get a subjective response: would using "return x" be better? If you are reconsidering options, reconsider "yield", meaning "break current context with this value". Which is what we are allowing break with val to mean. Which argues for allowing either (break-val or yield-val) in lambdas as well because... > (We abuse it slightly in lambdas, but people seem OK with this, > probably because they think of lambdas as methods anyway.) 
-Doug From brian.goetz at oracle.com Thu Mar 15 18:50:45 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 15 Mar 2018 14:50:45 -0400 Subject: break seen as a C archaism In-Reply-To: <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> Message-ID: >> We had rejected this earlier for fairly obvious reasons, but let me >> ask to get a subjective response: would using "return x" be better? > If you are reconsidering options, reconsider "yield", meaning > "break current context with this value". Still feeling a little burned by first time we floated this, but willing to try another run up the flagpole.... In Lambda, I used the early "State of the Lambda" drafts as a means to test-drive various syntax options.? SotL 2/e floated "yield" as the get-out-of-lambda card, and I was unprepared for the degree of "you big fat stupid idiot, don't you know what yield means" response I got.? So we beat a hasty retreat from that experiment, temporarily settled on return, and then failed to circle back.? I still regret the choice of return for lambda. The primary objection to yield was from the async/await crowd that would want us to save it for that, but I don't see them as mutually exclusive (nor do I think async/await is all that likely, especially with the great work happening over in Loom). The loss of using something other than "break" is that now expression and statement switches become more obviously different beasts, which might be OK. From mark at io7m.com Thu Mar 15 19:06:40 2018 From: mark at io7m.com (Mark Raynsford) Date: Thu, 15 Mar 2018 19:06:40 +0000 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> Message-ID: <20180315190640.6c71859b@copperhead.int.arc7.info> On 2018-03-15T14:50:45 -0400 Brian Goetz wrote: > > > If you are reconsidering options, reconsider "yield", meaning > > "break current context with this value". > > Still feeling a little burned by first time we floated this, but willing > to try another run up the flagpole.... Silly idea, but... *puts on fireproof suit*: "finally x;" -- Mark Raynsford | http://www.io7m.com From guy.steele at oracle.com Thu Mar 15 19:12:56 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 15 Mar 2018 15:12:56 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> Message-ID: <0EFF40FB-FB1A-496D-B72D-3F0402CECB42@oracle.com> > On Mar 15, 2018, at 2:50 PM, Brian Goetz wrote: > > >>> We had rejected this earlier for fairly obvious reasons, but let me >>> ask to get a subjective response: would using "return x" be better? >> If you are reconsidering options, reconsider "yield", meaning >> "break current context with this value". > > Still feeling a little burned by first time we floated this, but willing to try another run up the flagpole.... > > In Lambda, I used the early "State of the Lambda" drafts as a means to test-drive various syntax options. SotL 2/e floated "yield" as the get-out-of-lambda card, and I was unprepared for the degree of "you big fat stupid idiot, don't you know what yield means" response I got. 
So we beat a hasty retreat from that experiment, temporarily settled on return, and then failed to circle back. I still regret the choice of return for lambda. > > The primary objection to yield was from the async/await crowd that would want us to save it for that, but I don't see them as mutually exclusive (nor do I think async/await is all that likely, especially with the great work happening over in Loom). > > The loss of using something other than "break" is that now expression and statement switches become more obviously different beasts, which might be OK. I have to agree that ?yield? has too much of a history in the topics of multithreading and coroutining, giving it all the wrong connotations for our purpose here. From john.r.rose at oracle.com Thu Mar 15 19:32:08 2018 From: john.r.rose at oracle.com (John Rose) Date: Thu, 15 Mar 2018 12:32:08 -0700 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: <5216FBA2-A5B4-4E04-BFDC-729A90871795@oracle.com> On Mar 15, 2018, at 11:11 AM, Brian Goetz wrote: > We had rejected this earlier for fairly obvious reasons, but let me > ask to get a subjective response: would using "return x" be better? > On the one hand, it's not really a return, and it doesn't build on the > user intuition about the control flow aspects of break, but on the > other, the return statement is already prepared to take a value, so > its not adding a "new form" to the existing statement, though it is > adding a new and different context. (We abuse it slightly in lambdas, > but people seem OK with this, probably because they think of lambdas > as methods anyway.) Here's my take on how this is going, for what it's worth. We're going round and round on this because there's isn't a comfortable spot to land. But there is a nearly comfortable spot, which is where we started: Although break's legacy syntax requires an overload for it to accept a value, the following two legacy facts pull us toward break: - break is to switch as return is to a method body (in branch behavior) - return is a branch which can be overloaded with an optional value Now, switches are becoming more like methods: They can return values. So the most direct path is to overload break "like return", in some way. What way? Well, the straightforward way works, with the usual tricks to avoid ambiguities in overloaded constructs. At that point we have added to the language by increasing symmetry, a win. Let's beware of making new language features #{ Stand Out }. It makes them look silly and puzzling when they mature. Many objections to novelties like "break x" are of two potential kinds: a. "I want the new thing to be easier to spot", b. "Programmers will never be at ease with that". People tend to use a. as a proxy for b., but as we go round and round I think many of us tend to forget about b., which is the real point. Can anyone successfully argue that "break x" liable to b.? I doubt it. Our quest for continuity and symmetry with the past is our best trick for avoiding b. (and c., "Let's fork the language"), even at the temporary cost of a. Neal Gafter says it very well: > It is better to design the feature so that it fits well with the existing > features of the language, even if it might at first seem jarring that > things have changed. Users will quickly get over the newness of > the changes and learn to understand the language as a whole as > it is (after the change). 
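For concreteness, here is roughly how the overloaded break reads under the working syntax in this thread (a sketch only; the enum and the chosen values are invented):

    int bits = switch (size) {
        case SMALL: break 8;        // "break value": exits the switch with a result, as return exits a method
        case MEDIUM: break 16;
        default:
            System.err.println("unexpected size: " + size);   // ordinary statements may precede the yield
            break 32;
    };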
The adding of -> to the mix is an inspired move (thanks, Brian) because it repurposes another existing part of the language, and adds it as sugar for "break x", keeping continuity and improving clarity at the same time. My $0.02. ? John P.S. FTR here's the reference to Neal's blog: http://gafter.blogspot.com/2017/06/making-new-language-features-stand-out.html P.P.S. The above reasoning might lead to other places too: If we were to make loops return values, then "break" would do the same for them. Also "continue" could be overloaded to deliver a partial result to the loop control. That's sci fi at the present moment, but I'm showing it as an example how the logic works. From guy.steele at oracle.com Thu Mar 15 19:18:34 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 15 Mar 2018 15:18:34 -0400 Subject: break seen as a C archaism In-Reply-To: <20180315190640.6c71859b@copperhead.int.arc7.info> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> <201803151906! 40.6c71859b@copperhead.int.arc7.info> Message-ID: > On Mar 15, 2018, at 3:06 PM, Mark Raynsford wrote: > > On 2018-03-15T14:50:45 -0400 > Brian Goetz wrote: >> >>> If you are reconsidering options, reconsider "yield", meaning >>> "break current context with this value". >> >> Still feeling a little burned by first time we floated this, but willing >> to try another run up the flagpole.... > > Silly idea, but... *puts on fireproof suit*: > > "finally x;" Interestingly, the keywords `try` and `catch` and `finally` currently must each be followed by a block, so there is indeed syntactic space to use each one with a following expression instead. Which only suggests that . . . *puts on fireproof suit and then climbs into a concrete bunker and slams the door*: ?try x;? would be shorter and no sillier. ?Guy From forax at univ-mlv.fr Thu Mar 15 19:40:01 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 15 Mar 2018 20:40:01 +0100 (CET) Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> <201803151906!40.6c71859b@copperhead.int.arc7.info> Message-ID: <55989289.1877090.1521142801249.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Guy Steele" > ?: "mark" > Cc: "amber-spec-experts" > Envoy?: Jeudi 15 Mars 2018 20:18:34 > Objet: Re: break seen as a C archaism >> On Mar 15, 2018, at 3:06 PM, Mark Raynsford wrote: >> >> On 2018-03-15T14:50:45 -0400 >> Brian Goetz wrote: >>> >>>> If you are reconsidering options, reconsider "yield", meaning >>>> "break current context with this value". >>> >>> Still feeling a little burned by first time we floated this, but willing >>> to try another run up the flagpole.... >> >> Silly idea, but... *puts on fireproof suit*: >> >> "finally x;" > > Interestingly, the keywords `try` and `catch` and `finally` currently must each > be followed by a block, so there is indeed syntactic space to use each one with > a following expression instead. > > Which only suggests that . . . *puts on fireproof suit and then climbs into a > concrete bunker and slams the door*: > > ?try x;? > > would be shorter and no sillier. > > ?Guy It seams too close to the try-with-resources. 
compare try (resource) -> { }; // a try that break/return a lambda with try (resource) { } // a try-with-resources R?mi From forax at univ-mlv.fr Thu Mar 15 19:41:35 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 15 Mar 2018 20:41:35 +0100 (CET) Subject: break seen as a C archaism In-Reply-To: <0EFF40FB-FB1A-496D-B72D-3F0402CECB42@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> <0EFF40FB-FB1A-496D-B72D-3F0402CECB42@oracle.com> Message-ID: <967883205.1877210.1521142895224.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Guy Steele" > ?: "Brian Goetz" > Cc: "amber-spec-experts" > Envoy?: Jeudi 15 Mars 2018 20:12:56 > Objet: Re: break seen as a C archaism >> On Mar 15, 2018, at 2:50 PM, Brian Goetz wrote: >> >> >>>> We had rejected this earlier for fairly obvious reasons, but let me >>>> ask to get a subjective response: would using "return x" be better? >>> If you are reconsidering options, reconsider "yield", meaning >>> "break current context with this value". >> >> Still feeling a little burned by first time we floated this, but willing to try >> another run up the flagpole.... >> >> In Lambda, I used the early "State of the Lambda" drafts as a means to >> test-drive various syntax options. SotL 2/e floated "yield" as the >> get-out-of-lambda card, and I was unprepared for the degree of "you big fat >> stupid idiot, don't you know what yield means" response I got. So we beat a >> hasty retreat from that experiment, temporarily settled on return, and then >> failed to circle back. I still regret the choice of return for lambda. >> >> The primary objection to yield was from the async/await crowd that would want us >> to save it for that, but I don't see them as mutually exclusive (nor do I think >> async/await is all that likely, especially with the great work happening over >> in Loom). >> >> The loss of using something other than "break" is that now expression and >> statement switches become more obviously different beasts, which might be OK. > > I have to agree that ?yield? has too much of a history in the topics of > multithreading and coroutining, giving it all the wrong connotations for our > purpose here. yes ! and if Loom at some point requires a syntax instead of being a pure API, it will be unfortunate. R?mi From guy.steele at oracle.com Thu Mar 15 19:28:04 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 15 Mar 2018 15:28:04 -0400 Subject: break seen as a C archaism In-Reply-To: <55989289.1877090.1521142801249.JavaMail.zimbra@u-pem.fr> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> <201803151906!40.6c71859b@copperhead.int.arc7.info> <55989289.1877090.1521142801249.JavaMail.zimbra@u-pem.fr> Message-ID: <0C485CCB-929F-4B41-BCDA-26E63EB7F7A9@oracle.com> > On Mar 15, 2018, at 3:40 PM, Remi Forax wrote: > > > > ----- Mail original ----- >> De: "Guy Steele" >> ?: "mark" >> Cc: "amber-spec-experts" >> Envoy?: Jeudi 15 Mars 2018 20:18:34 >> Objet: Re: break seen as a C archaism > >>> On Mar 15, 2018, at 3:06 PM, Mark Raynsford wrote: >>> >>> On 2018-03-15T14:50:45 -0400 >>> Brian Goetz wrote: >>>> >>>>> If you are reconsidering options, reconsider "yield", meaning >>>>> "break current context with this value". >>>> >>>> Still feeling a little burned by first time we floated this, but willing >>>> to try another run up the flagpole.... >>> >>> Silly idea, but... 
*puts on fireproof suit*: >>> >>> "finally x;" >> >> Interestingly, the keywords `try` and `catch` and `finally` currently must each >> be followed by a block, so there is indeed syntactic space to use each one with >> a following expression instead. >> >> Which only suggests that . . . *puts on fireproof suit and then climbs into a >> concrete bunker and slams the door*: >> >> ?try x;? >> >> would be shorter and no sillier. >> >> ?Guy > > It seams too close to the try-with-resources. > > compare > try (resource) -> { }; // a try that break/return a lambda > with > try (resource) { } // a try-with-resources > > R?mi Indeed. Mine was not a serious suggestion. I agree with John Rose?s analysis: ?break x;? really does seem to be the best point in the design space, especially since you can use ?->? to hide it 98% of the time. "But I was thinking of a plan To dye one's whiskers green, And always use so large a fan That they could not be seen.? ?Lewis Carroll From forax at univ-mlv.fr Thu Mar 15 19:48:51 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 15 Mar 2018 20:48:51 +0100 (CET) Subject: break seen as a C archaism In-Reply-To: <20180315190640.6c71859b@copperhead.int.arc7.info> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> <20180315190640.6c71859b@copperhead.int.arc7.info> Message-ID: <1880063835.1878032.1521143331716.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "mark" > ?: "amber-spec-experts" > Envoy?: Jeudi 15 Mars 2018 20:06:40 > Objet: Re: break seen as a C archaism > On 2018-03-15T14:50:45 -0400 > Brian Goetz wrote: >> >> > If you are reconsidering options, reconsider "yield", meaning >> > "break current context with this value". >> >> Still feeling a little burned by first time we floated this, but willing >> to try another run up the flagpole.... > > Silly idea, but... *puts on fireproof suit*: > > "finally x;" I believe we can also use any new keywords given that you can not have an identifier followed by an identifier in Java. by example pass x; quit x; end x; > > -- > Mark Raynsford | http://www.io7m.com R?mi From guy.steele at oracle.com Thu Mar 15 19:36:13 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 15 Mar 2018 15:36:13 -0400 Subject: break seen as a C archaism In-Reply-To: <1880063835.1878032.1521143331716.JavaMail.zimbra@u-pem.fr> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> <20180315190640.6c71859b@copperhead.int.arc7.info> <1880063835.1878032.1521143331716.JavaMail.zimbra@u-pem.fr> Message-ID: <0582A51E-FFB9-44AF-9627-84FEB0B2652C@oracle.com> > On Mar 15, 2018, at 3:48 PM, Remi Forax wrote: > > ----- Mail original ----- >> De: "mark" >> ?: "amber-spec-experts" >> Envoy?: Jeudi 15 Mars 2018 20:06:40 >> Objet: Re: break seen as a C archaism > >> On 2018-03-15T14:50:45 -0400 >> Brian Goetz wrote: >>> >>>> If you are reconsidering options, reconsider "yield", meaning >>>> "break current context with this value". >>> >>> Still feeling a little burned by first time we floated this, but willing >>> to try another run up the flagpole.... >> >> Silly idea, but... *puts on fireproof suit*: >> >> "finally x;" > > I believe we can also use any new keywords given that you can not have an identifier followed by an identifier in Java. > > by example > pass x; > quit x; > end x; Remember that in this situation (switch expressions), `x` can be any expression, not just an identifier. So ?pass x;? 
cannot be confused with existing syntax, but ?pass (x)? can be (looks like a method call). ?Guy From forax at univ-mlv.fr Thu Mar 15 19:57:39 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Thu, 15 Mar 2018 20:57:39 +0100 (CET) Subject: break seen as a C archaism In-Reply-To: <0582A51E-FFB9-44AF-9627-84FEB0B2652C@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <68d1a0ed-2850-dd17-70da-1d2dc0c7e500@cs.oswego.edu> <20180315190640.6c71859b@copperhead.int.arc7.info> <1880063835.1878032.1521143331716.JavaMail.zimbra@u-pem.fr> <0582A51E-FFB9-44AF-9627-84FEB0B2652C@oracle.com> Message-ID: <875201491.1878952.1521143859265.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Guy Steele" > ?: "Remi Forax" > Cc: "mark" , "amber-spec-experts" > Envoy?: Jeudi 15 Mars 2018 20:36:13 > Objet: Re: break seen as a C archaism >> On Mar 15, 2018, at 3:48 PM, Remi Forax wrote: >> >> ----- Mail original ----- >>> De: "mark" >>> ?: "amber-spec-experts" >>> Envoy?: Jeudi 15 Mars 2018 20:06:40 >>> Objet: Re: break seen as a C archaism >> >>> On 2018-03-15T14:50:45 -0400 >>> Brian Goetz wrote: >>>> >>>>> If you are reconsidering options, reconsider "yield", meaning >>>>> "break current context with this value". >>>> >>>> Still feeling a little burned by first time we floated this, but willing >>>> to try another run up the flagpole.... >>> >>> Silly idea, but... *puts on fireproof suit*: >>> >>> "finally x;" >> >> I believe we can also use any new keywords given that you can not have an >> identifier followed by an identifier in Java. >> >> by example >> pass x; >> quit x; >> end x; > > Remember that in this situation (switch expressions), `x` can be any expression, > not just an identifier. > > So ?pass x;? cannot be confused with existing syntax, but ?pass (x)? can be > (looks like a method call). > > ?Guy yes, thanks. R?mi From maurizio.cimadamore at oracle.com Thu Mar 15 21:13:29 2018 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 15 Mar 2018 21:13:29 +0000 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> Message-ID: <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> So, from a language design perspective, 'return x' is wrong - but, as you point out, we already committed the original sin of having 'return == local return' for lambdas, so I'm not too convinced that we couldn't use the same story again here. E.g. when you say 'return', what you really mean is 'returning from the innermost context'. This could be a method (as usual), or a nested expression e.g. a lambda or a switch expression. Kevin has a point in that using return is mildly worrisome when it comes to refactoring; but we had exactly the same problem with lambdas when we were considering migrating code using internal iteration (for loop) to code using external iteration (Stream forEach) - again, there the refactoring could not be 100% smooth - if the body of your loop had some abnormally completing branches, then there was no way to translate that into an external iteration idiom - at least not mechanically (e.g. 'return x' already meant something different inside old-style for loop bodies). So, seems to me that we end up with the same bag of pros and cons? E.g. 
more familiar to the user (return is something that they know and love), but more smelly from a design point of view (in a way that forecloses using 'return' to mean non-local return, but I wonder - has that ship already sailed?) Maurizio On 15/03/18 18:11, Brian Goetz wrote: > > > On 3/14/2018 2:04 PM, Kevin Bourrillion wrote: >> On Wed, Mar 14, 2018 at 8:14 AM, Brian Goetz > > wrote: >> >> In the meantime, let me probe for what's really uncomfortable >> about the current design point.? Is it: >> >> >> ?- That we are overloading an existing control construct, >> "break", to mean something just different enough to be uncomfortable; >> >> >> To some degree yes, since `break ` already means something. > > We had rejected this earlier for fairly obvious reasons, but let me > ask to get a subjective response: would using "return x" be better?? > On the one hand, it's not really a return, and it doesn't build on the > user intuition about the control flow aspects of break, but on the > other, the return statement is already prepared to take a value, so > its not adding a "new form" to the existing statement, though it is > adding a new and different context.? (We abuse it slightly in lambdas, > but people seem OK with this, probably because they think of lambdas > as methods anyway.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Thu Mar 15 21:58:28 2018 From: john.r.rose at oracle.com (John Rose) Date: Thu, 15 Mar 2018 14:58:28 -0700 Subject: break seen as a C archaism In-Reply-To: <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> Message-ID: On Mar 15, 2018, at 2:13 PM, Maurizio Cimadamore wrote: > > So, from a language design perspective, 'return x' is wrong - but, as you point out, we already committed the original sin of having 'return == local return' for lambdas, so I'm not too convinced that we couldn't use the same story again here. E.g. when you say 'return', what you really mean is 'returning from the innermost context'. This could be a method (as usual), or a nested expression e.g. a lambda or a switch expression. > > We have method bodies and lambda bodies on one hand, and we have switches and loops on the other. We use return to escape from the former, and break to escape from the latter. Note that return may or may not take an expression, while break never does, at present. So far so good. Now we stir in expression switches. Which side of the fence do they belong on? It seems to me that your position needs to argue that e-switches belong with methods and lambdas, because only return can take an expression. If you can pull this off, then break doesn't need to take an expression. Likewise, my position need to argue that giving "break" an expression is reasonable. I don't need to argue that expression switches are similar to legacy switches. (But I'm trying to spike the argument that it's hard to unify e-switches and s-switches, so let's just fork the language with a new switch-like feature for expressions.) But there are two reasons why e-switch doesn't belong with method body and lambda body, a shallow but strong one, and a deep one. Shallow but strong: e-switches are obviously switches. Deep: Lambda bodies and method bodies execute in their own stack frames. Any up-level references must be to final locals (or fields). 
Lambda bodies and methods can execute at most one "return", which tears down their frame. Expressions, including expression switches, execute in the frame of the containing lambda body or method and can read *and write* local variables. Expressions are inherently local to a frame and can imperatively side effect it. A "return" which in some contexts keeps the stack frame and jumps somewhere is a weaker return than today's return. (Weaker meaning less can be concluded by observing it in code.) So I can't group e-switch cases with lambda bodies. I know some have performed this feat to their own satisfaction, but it's hard for me, in a way that seems deeper than just learning curve. By now we recognize that adding an expression to "break" is no big deal; it's a new overloading. I agree that it is open to the accusation that it's not thrifty, that "return" already does that job. But it seems to me the shallow and deep points above answer the accusation. For me, the cost of making "break" do a new trick is paid for by the benefit of not inventing a new switch-like expression (like ?: for if/else), and not having to weaken "return". ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Thu Mar 15 21:44:04 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 15 Mar 2018 17:44:04 -0400 Subject: break seen as a C archaism In-Reply-To: <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> Message-ID: Okay, Maurizio, you got me thinking. As long as we are convinced that we are actually going to use an explicit value-returning statement within a switch expression quite infrequently in real code, why don?t we get the best of both worlds by spelling it this way: break return x; Then everybody is happy: (1) Cannot be confused with the old `break` syntax. (2) Clearly exits a `switch` like `break` does. (3) Clearly returns a value like `return` does. (4) Better encourages exclusive use of `->` (because using `->` rather than `: break return` saves even more characters than using `->` rather than `: break`). (5) In the year 2364, this can be further generalized to allow `continue return x;`. (6) Those who want new language features to really jump out will surely be satisfied. ?Guy > On Mar 15, 2018, at 5:13 PM, Maurizio Cimadamore wrote: > > So, from a language design perspective, 'return x' is wrong - but, as you point out, we already committed the original sin of having 'return == local return' for lambdas, so I'm not too convinced that we couldn't use the same story again here. E.g. when you say 'return', what you really mean is 'returning from the innermost context'. This could be a method (as usual), or a nested expression e.g. a lambda or a switch expression. > > Kevin has a point in that using return is mildly worrisome when it comes to refactoring; but we had exactly the same problem with lambdas when we were considering migrating code using internal iteration (for loop) to code using external iteration (Stream forEach) - again, there the refactoring could not be 100% smooth - if the body of your loop had some abnormally completing branches, then there was no way to translate that into an external iteration idiom - at least not mechanically (e.g. 'return x' already meant something different inside old-style for loop bodies). 
> > So, seems to me that we end up with the same bag of pros and cons? E.g. more familiar to the user (return is something that they know and love), but more smelly from a design point of view (in a way that forecloses using 'return' to mean non-local return, but I wonder - has that ship already sailed?) > > Maurizio -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Thu Mar 15 22:06:51 2018 From: john.r.rose at oracle.com (John Rose) Date: Thu, 15 Mar 2018 15:06:51 -0700 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> Message-ID: <84E6F885-E7E2-4F9D-9E14-32D9B3BF26FC@oracle.com> On Mar 15, 2018, at 2:44 PM, Guy Steele wrote: > > > break return x; > > Then everybody is happy: > (1) Cannot be confused with the old `break` syntax. > (2) Clearly exits a `switch` like `break` does. > (3) Clearly returns a value like `return` does. > (4) Better encourages exclusive use of `->` (because using `->` rather than `: break return` saves even more characters than using `->` rather than `: break`). > (5) In the year 2364, this can be further generalized to allow `continue return x;`. > (6) Those who want new language features to really jump out will surely be satisfied. Not bad. It also doesn't weaken "plain return" in the way I was worried about. I would have numbered that last point (-1), though. ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Thu Mar 15 22:37:10 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 15 Mar 2018 23:37:10 +0100 (CET) Subject: break seen as a C archaism In-Reply-To: <84E6F885-E7E2-4F9D-9E14-32D9B3BF26FC@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> <84E6F885-E7E2-4F9D-9E14-32D9B3BF26FC@oracle.com> Message-ID: <1430569126.1899355.1521153430458.JavaMail.zimbra@u-pem.fr> > De: "John Rose" > ?: "Guy Steele" > Cc: "amber-spec-experts" > Envoy?: Jeudi 15 Mars 2018 23:06:51 > Objet: Re: break seen as a C archaism > On Mar 15, 2018, at 2:44 PM, Guy Steele < [ mailto:guy.steele at oracle.com | > guy.steele at oracle.com ] > wrote: >> break return x; >> Then everybody is happy: >> (1) Cannot be confused with the old `break` syntax. >> (2) Clearly exits a `switch` like `break` does. >> (3) Clearly returns a value like `return` does. >> (4) Better encourages exclusive use of `->` (because using `->` rather than `: >> break return` saves even more characters than using `->` rather than `: >> break`). >> (5) In the year 2364, this can be further generalized to allow `continue return >> x;`. >> (6) Those who want new language features to really jump out will surely be >> satisfied. > Not bad. It also doesn't weaken "plain return" in the > way I was worried about. > I would have numbered that last point (-1), though. > ? John i think, we're missing a 'do' just to be sure, do break return x; R?mi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.goetz at oracle.com Thu Mar 15 22:38:38 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 15 Mar 2018 18:38:38 -0400 Subject: break seen as a C archaism In-Reply-To: <1430569126.1899355.1521153430458.JavaMail.zimbra@u-pem.fr> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> <84E6F885-E7E2-4F9D-9E14-32D9B3BF26FC@oracle.com> <1430569126.1899355.1521153430458.JavaMail.zimbra@u-pem.fr> Message-ID: <846bdaec-4e94-abe9-cc46-806a3492e864@oracle.com> At this point, the Colonel from Monty Python breaks in, and shuts us down for being too silly.... On 3/15/2018 6:37 PM, Remi Forax wrote: > > > ------------------------------------------------------------------------ > > *De: *"John Rose" > *?: *"Guy Steele" > *Cc: *"amber-spec-experts" > *Envoy?: *Jeudi 15 Mars 2018 23:06:51 > *Objet: *Re: break seen as a C archaism > > On Mar 15, 2018, at 2:44 PM, Guy Steele > wrote: > > > break return x; > > Then everybody is happy: > (1) Cannot be confused with the old `break` syntax. > (2) Clearly exits a `switch` like `break` does. > (3) Clearly returns a value like `return` does. > (4) Better encourages exclusive use of `->` (because using > `->` rather than `: break return` saves even more characters > than using `->` rather than `: break`). > (5) In the year 2364, this can be further generalized to allow > `continue return x;`. > (6) Those who want new language features to really jump out > will surely be satisfied. > > > Not bad. ?It also doesn't weaken "plain return" in the > way I was worried about. > > I would have numbered that last point (-1), though. > > ? John > > > i think, we're missing a 'do' just to be sure, > ? do break return x; > > R?mi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 16 00:01:46 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 15 Mar 2018 20:01:46 -0400 Subject: Patterns and nulls In-Reply-To: <0cb9a653-d087-e4a2-da22-6d5395305580@oracle.com> References: <0cb9a653-d087-e4a2-da22-6d5395305580@oracle.com> Message-ID: <7c6088f7-d7f2-4386-ff3a-00e38ff4b5df@oracle.com> After going over this in more detail, I have some simplifications we can apply to this regarding null, default, and exhaustiveness. Key observation: we were subtly taking sides on the accept nulls vs reject nulls divide, and that left us in a hard-to-{explain,defend} place.? Instead we should make it so that both null-lovers and null-haters have reasonable idioms for doing their favorite thing.? (And we almost started doing the same thing with exhaustiveness for statement switches, but we pulled back from that precipice.) The basic changes to the below story are: ?- We don't need the distinction between constant switches and pattern switches, except in some tiny corners of the spec; ?- Default need not be shuffled out to retirement; it means "anything else, except null", and can be used in any switch; ?- We define a category of _types_ that are "exhaustible", which means its possible to cover all the values without a catch-all. This include enums, some primitives (definitely boolean, maybe byte, eventually maybe others, especially if we add ranges), and, when we have them, sealed types. ?- A switch is exhaustive if (a) it is over an exhaustive type and all cases are explicitly covered, (b) it contains a "default" case, or (c) it contains a total pattern case. 
?- A switch NPEs on null if it contains neither "case null" nor a total type test pattern. ?- An expression switch over an exhaustive type that contains neither a default nor total type pattern will throw some exception (MatchException, ICCE, etc) when it encounters a target that didn't exist at compile time (an enum constant or sealed type member added later via separate compilation). All the rest still holds. What was missing, and what made me uncomfortable, was not that we were biasing "case Object" towards including null, but not having a way to finish the switch with "everything, except null" -- which is what is natural in some cases.? So now, the null-lovers can use an explicit case null, or use "Object" (or var or _ or some other total pattern) at the bottom of their switches and the nulls are lumped in there.? The null-haters can keep using default, and the switch NPEs just like it always did.? And its fairly easy to look at a switch (does it have case null, or a null-accepting total pattern (which is always last)?) and tell what it does on null. There's a similar divide on exhaustiveness for statement switches.? Some people think biasing towards exhaustiveness is a good idea; some think that switch is just like "if", and not all if's need an "else", and any push towards exhaustiveness is meddling.? For the exhaustiveness-lovers, they can manually enumerate all the cases (enums, sealed type members), and have a throwing default which will detect unexpected targets that would be impossible at compile time.? (At some point, maybe we'll help them by adding support for "default: unreachable", which would provide not only runtime detection but would enlist the compiler's flow analysis as well.)? For the non-exhaustiveness-fans, just don't use default or other total pattern.? Everyone can get what they want. On 3/14/2018 12:58 PM, Brian Goetz wrote: > In the message "More on patterns, generics, null, and primitives", > Gavin outlines how these constructs will be treated in pattern > matching.? This mail is a refinement of that, specifically, to refine > how nulls are treated. > > Rambling Background Of Why This Is A Problem At All > --------------------------------------------------- > > Nulls will always be a source of corner cases and surprises, so the > best we can likely do is move the surprises around to coincide with > existing surprise modes.? One of the existing surprise modes is that > switches on reference types (boxes, strings, and enums) currently > always NPE when passed a null.? You could characterize switch's > current treatment of null as "La la la can't hear you la la la."? (I > think this decision was mostly made by frog-boiling; in Java 1.0, > there were no switches on reference types, so it was not an issue; > when switches on boxes was added, it was done by appeal to > auto-unboxing, which throws on null, and null enums are rare enough > that no one felt it was important enough to do something different for > them.? Then when we added string switch in 7, we were already mostly > sliding the slippery slope of past precedent.) > > The "la la la" approach has gotten us pretty far, but I think finally > runs out of gas when we have nested patterns.? It might be OK to NPE > when x = null here: > > ??? switch (x) { > ??????? case String: ... > ??????? case Integer: ... > ??????? default: ... > ??? } > > but it is certainly not OK to NPE when b = new Box(null): > > ??? switch (b) { > ??????? case Box(String s): ... > ??????? case Box(Integer i): ... > ??????? 
case Box(Object o): ... > ??? } > > since `Box(null)` is a perfectly reasonable box.? (Which of these > patterns matches `Box(null)` is a different story, see below.)? So > problem #1 with is that we need a way to match nulls in nested > patterns; having nested patterns throw whenever any intermediate > binding produces null would be crazy.? So, we have to deal with nulls > in this way.? It seems natural, therefore, to be able to confront it > directly: > > ??? case Box(null): ... > > which is just an ordinary nested pattern, where our target matches > `Box(var x)` and further x matches null.? Which means `x matches null` > need to be a thing, even if switch is hostile to nulls. > > But if you pull on this string a bit more, we'd also like to do the > same at the top level, because we'd like to be able to refactor > > ??? switch (b) { > ??????? case Box(null): ... > ??????? case Box(Candy): ... > ??????? case Box(Object): ... > ??? } > > into > > ??? switch (b) { > ??????? case Box(var x): > ? ? ? ?? ?? switch (x) { > ? ? ? ?? ?????? case null: ... > case Candy: ... > case Object: ... > ??????????? } > ??? } > > with no subtle semantics changes.? I think this is what users will > expect, and cutting them on sharp edges here wouldn't be doing them > favors. > > > Null and Type Patterns > ---------------------- > > The previous iteration outlined in Gavin's mail was motivated by a > sensible goal, but I think we took it a little too literally. Which is > that if I have a `Box(null)`, it should match the following: > > ??? case Box(var x): > > because it would be weird if `var x` in a nested context really meant > "everything but null."? This led us to the position that > > ??? case Box(Object o): > > should also match `Box(null)`, because `var` is just type inference, > and the compiler infers `Object` here from the signature of the `Box` > deconstructor.? So `var` and the type that gets inferred should be > treated the same.? (Note that Scala departs from this, and the results > are pretty confusing.) > > You might convince yourself that `Box(Object)` not matching > `Box(null)` is not a problem, just add a case to handle null, with an > OR pattern (aka non-harmful fallthrough): > > ??? case Box(null): // fall through > ??? case Box(Object): ... > > But, this only works in the simple case.? What if my Box deconstructor > had four binding variables: > > ??? case Box(P, Q, R, S): > > Now, to capture the same semantics, you need four more cases: > > ??? case Box(null, Q, R, S): // fall through > ??? case Box(P, null, R, S):// fall through > ??? case Box(P, Q, null, S): // fall through > ??? case Box(P, Q, R, null): // fall through > ??? case Box(P, Q, R, S): > > But wait, it gets worse, since if P and friends have binding > variables, and the null pattern does not, the binding variables will > not be DA and therefore not be usable.? And if we graft binding > variables onto constant patterns, we have a potential typing problem, > since the type of merged binding variables in OR patterns should > match.? So this is a tire fire, let's back away slowly. > > So, we want at least some type patterns to match null, at least in > nested contexts.? Got it. > > This led us to: a type pattern `T t` should match null.? But clearly, > in the switch > > ??? switch (aString) { > ??????? case String s: ... > ??? } > > it NPEs (since that's what it does today.)? So we moved the null > hostility to `switch`, which involved an analysis of whether `case > null` was present.? 
As Kevin pointed out, that was pretty confusing > for the users to keep track of.? So that's not so good. > > Also not so good: if type patterns match null, then the dominance > order rule says you can't put a `case null` arm after a type pattern > arm, because the `case null` will be dead.? (Just like you can't catch > `IOException` after catching `Throwable`.)? Which deprived case null > of most of its remaining usefulness, which is: lump null in with the > default.? If users want to use `case null`, they most likely want this: > > ??? switch (o) { > ??????? case A: ... > ??????? case B: ... > ??????? case null: // fall through > ??????? default: > ??????????? // deal with unexpected values > ??? } > > If we can't do that -- which the latest iteration said we can't -- its > pretty useless.? So, we got something wrong with type patterns too.? > Tricky buggers, these nulls! > > > Some Problems With the Current Plan > ----------------------------------- > > The current plan, even though it came via a sensible path, has lots of > problems.? Including: > > ?- Its hard to reason about which switches throw on null and which > don't.? (This will never be easy, but we can make it less hard.) > ?- We have asymmetries between nested and non-nested patterns; if we > unroll a nested pattern to a nested switch, the semantics shift subtly > out from under us. > ?- There's no way to say "default including null", which is what > people would actually want to do if they had explicit control over > nulls.? Having `String s` match null means our ordering rules force > the null case too early, depriving us of the ability to lump it in > with another case. > > Further, while the intent of `Box(var x)` matches `Box(null)` was > right, and that led us to `Box(Object)` matches `Box(null)`, we didn't > pull this string to the end.? So let's break some assumptions and > start over. > > Let's assume we have the following declarations: > > ??? record Box(Object); > ??? Object o; > ??? String s; > ??? Box b; > > Implicitly, `Box` has a deconstruction pattern whose signature is > `Box(out Object o)`. > > What will users expect on the following? > > ??? Box b = new Box(null); > ??? switch (b) { > ??????? case Box(Candy x): ... > ??????? case Box(Frog f): ... > ??????? case Box(Object o): ... > ??? } > > There are four non-ridiculous possibilities: > ?- NPE > ?- Match none > ?- Match Box(Candy) > ?- Match Box(Object) > > I argued above why NPE is undesirable; I think matching none of them > would also be pretty surprising, since `Box(null)` is a perfectly > reasonable element of the value set decribed by the pattern > `Box(Object)`.? If all type patterns match null, we'd match > `Box(Candy)` -- but that's pretty weird and arbitrary, and probably > not what the user expects.? It also means -- and this is a serious > smell -- that we couldn't freely reorder the independent cases > `Box(Candy)` and `Box(Frog)` without subtly altering behavior.? Yuck! > > So the only reasonable outcome is that it matches `Box(Object)`. We'll > need a credible theory why we bypass the candy and the frog buckets, > but I think this is what the user will expect -- `Box(Object)` is our > catch-all bucket. > > A Credible Theory > ----------------- > > Recall that matching a nested pattern `x matches Box(P)` means: > > ??? 
x matches Box(var alpha) && alpha matches P > > The theory by which we can reasonably claim that `Box(Object)` matches > `Box(null)` is that the nested pattern `Object` is _total_ on the type > of its target (alpha), and therefore can be statically deemed to match > without additional dynamic checks.? In > > ??????? case Box(Candy x): ... > ??????? case Box(Frog f): ... > ??????? case Box(Object o): ... > > the first two cases require additional dynamic type tests (instanceof > Candy / Frog), but the latter, if the target is a `Box` at all, > requires no further dynamic testing.? So we can _define_ `T t` to mean: > > ??? match(T t, e : U) === U <: T ? true : e instanceof U > > In other words, a total type pattern matches null, but a partial type > pattern does not.? That's great for the type system weenies, but does > it help the users?? I claim it does. It means that in: > > ??? Box b = new Box(null); > ??? switch (b) { > ??????? case Box(Candy x): ... > ??????? case Box(Frog f): ... > ??????? case Box(Object o): ... > ??? } > > We match `Box(Object)`, which is the catch-all `Box` handler. We can > freely reorder the first two cases, because they're unordered by > dominance, but we can't reorder either of them with `Box(Object)`, > because that would create a dead case arm. `Box(var x)` and `Box(T x)` > mean the same thing when `T` is the type that inference produces. > > So `Box(Candy)` selects all boxes known to contain candy; `Box(Frog)` > all boxes known to contain frogs; `Box(null)` selects a box containing > null, and `Box(_)` or `Box(var x)` or `Box(Object o)` selects all boxes. > > Further, we can unroll the above to: > > ??? Box b = new Box(null); > ??? switch (b) { > ??????? case Box(var x): > switch (x) { > case Candy c: ... > case Frog f: ... > case Object o: ... > ??????????? } > ??? } > > and it means _the same thing_; the nulls flow into the `Object` catch > basin, and I can still freely recorder the Candy/Frog cases. Whew. > This feels like we're getting somewhere. > > We can also now flow the `case null` down to where it falls through > into the "everything else" bucket, because type patterns no longer > match nulls.? If specified at all, this is probably where the user > most wants to put it. > > Note also that the notion of a "total pattern" (one whose > applicability, possibly modulo null, can be determined statically) > comes up elsewhere too.? We talked about a let-bind statement: > > ?? let Point(var x, var y) = p > > In order for the compiler to know that an `else` is not required on a > let-bind, the pattern has to be total on the static type of the > target.? So this notion of totality is a useful one. > > Where totality starts to feel uncomfortable is the fact that while > null _matches_ `Object o`, it is not `instanceof Object`.? More on > this later. > > This addresses all the problems we stated above, so what's the problem? > > Default becomes legacy > ---------------------- > > The catch is that the irregularity of `default` becomes even more > problematic.? The cure is we give `default` a gold watch, thank it for > its services, and grant it "Keyword Emeritus" status. > > What's wrong with default?? First, it's syntactically irregular. It's > not a pattern, so doesn't easily admit nesting or binding variables.? > And second, its semantically irregular; it means "everything else (but > not null!)"? Which makes it a poor catch-all.? We'd like for our > catch-all case -- the one that dominates all other possible cases -- > to catch everything.? 
We thought we wanted `default` to be equivalent > to a total pattern, but default is insufficiently total. > > So, let's define a _constant switch_ as one whose target is the > existing constant types (primitives, their boxes, strings, and enums) > and whose labels are all constants (the latter condition might not be > needed).? In a constant switch, retcon default to mean "all the > constants I've not explicitly enumerated, except null."? (If you want > to flow nulls into the default bin too, just add an explicit `case > null` to fall into default, _or_ replace `default` with a total > pattern.)? We act as if that constant switches have an implicit "case > null: NPE" _at the bottom_.? If you don't handle null explicitly (a > total pattern counts as handling it explicitly), you fall into that > bucket. > > Then, we _ban_ default in non-constant switches.? So if you want > patterns, swap your old deficient `default` for new shiny total > patterns, which are a better default, and are truly exhaustive (rather > than modulo-null exhaustive).? If we can do a little more to express > the intention of exhaustiveness for statement switches (which are not > required to be exhaustive), this gives us a path to "switches never > throw NPE if you follow XYZ rules." > > There's more work to do here to get to this statically-provable > null-safe switch future, but I think this is a very positive > direction.? (Of course, we can't prevent NPEs from people matching > against `Object o` and then dereferencing o.) > > Instanceof becomes instanceof > ----------------------------- > > The other catch is that we can't use `instanceof` to be the spelling > of our `matches` operator, because it conflicts with existing > `instanceof` treatment of nulls.? I think that's OK; `instanceof` is a > low-level primitive; matching is a high-level construct defined > partially in terms of instanceof. > > From guy.steele at oracle.com Fri Mar 16 00:22:36 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 15 Mar 2018 20:22:36 -0400 Subject: break seen as a C archaism In-Reply-To: <846bdaec-4e94-abe9-cc46-806a3492e864@oracle.com> References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> <84E6F885-E7E2-4F9D-9E14-32D9B3BF26FC@oracle.com> <1430569126.1899355.1521153430458.JavaMail.zimbra@u-pem.fr> <846bdaec-4e94-abe9-cc46-806a3492e864@oracle.com> Message-ID: Well, actually, Brian, I now realize that I had my tongue in only _one_ of my cheeks. Sleep on it and then see what you think. > On Mar 15, 2018, at 6:38 PM, Brian Goetz wrote: > > At this point, the Colonel from Monty Python breaks in, and shuts us down for being too silly.... > > On 3/15/2018 6:37 PM, Remi Forax wrote: >> >> >> De: "John Rose" >> ?: "Guy Steele" >> Cc: "amber-spec-experts" >> Envoy?: Jeudi 15 Mars 2018 23:06:51 >> Objet: Re: break seen as a C archaism >> On Mar 15, 2018, at 2:44 PM, Guy Steele > wrote: >> >> >> break return x; >> >> Then everybody is happy: >> (1) Cannot be confused with the old `break` syntax. >> (2) Clearly exits a `switch` like `break` does. >> (3) Clearly returns a value like `return` does. >> (4) Better encourages exclusive use of `->` (because using `->` rather than `: break return` saves even more characters than using `->` rather than `: break`). >> (5) In the year 2364, this can be further generalized to allow `continue return x;`. >> (6) Those who want new language features to really jump out will surely be satisfied. >> >> Not bad. 
It also doesn't weaken "plain return" in the >> way I was worried about. >> >> I would have numbered that last point (-1), though. >> >> ? John >> >> i think, we're missing a 'do' just to be sure, >> do break return x; >> >> R?mi >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Fri Mar 16 01:59:19 2018 From: john.r.rose at oracle.com (John Rose) Date: Thu, 15 Mar 2018 18:59:19 -0700 Subject: Patterns and nulls In-Reply-To: <7c6088f7-d7f2-4386-ff3a-00e38ff4b5df@oracle.com> References: <0cb9a653-d087-e4a2-da22-6d5395305580@oracle.com> <7c6088f7-d7f2-4386-ff3a-00e38ff4b5df@oracle.com> Message-ID: <24B7CD49-8F56-43C4-8008-F45F1E9B2E29@oracle.com> On Mar 15, 2018, at 5:01 PM, Brian Goetz wrote: > > For the exhaustiveness-lovers, they can manually enumerate all the cases (enums, sealed type members), and have a throwing default which will detect unexpected targets that would be impossible at compile time. (At some point, maybe we'll help them by adding support for "default: unreachable", which would provide not only runtime detection but would enlist the compiler's flow analysis as well.) I'm one of those exhaustiveness lovers, because I'm afraid of accidental fallout, which is what happens in today's switches when a surprise value comes through. I'm happy that expression switches will exclude fallout robustly. I'm also content that I can insert a throw in my statement switch to robustly exclude fallout, and happy that this throw is likely to be a simple notation, probably "default: throw;". Here are some more details about this: This use case implies a peculiar thing about the "throwing default": The compiler *may* be able to prove that the inserted throw is unreachable. It must not complain about it, however, since I have a legitimate need to mark my switch as protected against future accidents. In fact, javac treats enum switches with such open-mindedness. In the case of enums, the future accident would be a novel enum value coming through, after a separate recompilation introduced it. (Think of somebody adding Yule or Duodecember to enum Month.) As an exhaustiveness lover, I use a "throwing default" to button up my switch against such future novelties. The use case also implies that the throw statement must not be just any old thing (throw new AssertionError("oops")), but should align with the language's internal story of how to deal with these problems in expression switches. The notation should also be shorter than a normal throw expression, or programmers will find it hard to write and read, and be tempted to leave it out even when it would make the code more robust. Also, it has to be a throw of some sort, because it cannot allow execution to continue after it, lest the compiler complain about fallout from an enclosing method body or a path by which a blank local is left unassigned. I.e., the notation needs a special pass from the reachability police. This leads us to the following syntax, or one like it: switch ((ColorChannel)c) { case R: return red(); case G: return green(); case B: return blue(); default: throw; //not reached, no fallout ever } This would be a the refactoring of the corresponding expression switch: return switch ((ColorChannel)c) { case R-> red(); case G-> green(); case B-> blue(); //optional, can omit: default: throw; }; There's a trick where Java programmers can lean on DA rules to protect against fallout. 
Here's a third refactoring of those switches which uses this trick: int result; switch ((ColorChannel)c) { case R: result = red(); break; case G: result = green(); break; case B: result = blue(); break; default: throw; //not reached, no fallout ever } return result; (A very subtle bug can arise if result has an initializer, say "result=null", or if it is an object field, and there is no default. Then fallout is probably unexpected, and the user could be in trouble if a novel enum shows up in the future. Most coders want to avoid such traps if they can, and the language should help. That's another reason for a concise "default: throw" notation. The DA tricks and the tracking of live code paths, don't always diagnose unexpected fallout.) The compiler can and should choose the same error path for all these switches, one which will diagnose the problem adequately, and lead to suitable remedies like recompilation. The user should not be expected to compose an adequate exception message for responding to this corner case. (Analogy: What if we required every division statement to be accompanied by an exception message to use when a zero divisor occurred?) As a shortcut, programmers often use another trick, which inserts "default:" before the case label of one of the enum values. This relieves the programmer of concocting an exception expression, and is probably the easy way out that we take most often, but it is not always a robust answer. If Yule shows up, he'll be treated just like December, or whatever random month the programmer stuck the default label on to placate the reachability police. It would be better if the throw were easy to write and read; Yule would be welcomed in with the appropriate diagnostic rather than a silent miscalculation. Of course many switches are not exhaustive, and users *expect* fallout to happen. Leaving out the "default" selects this behavior (except for null, but then there's "case null:break"). If the switch looks exhaustive, there might be some doubt about whether the programmer intended exhaustiveness. In such situations a compiler warning might be helpful, and an IDE intention would certainly be helpful. The switch can be disambiguated by adding either "default: throw" or "default: break" (the latter confirming the implicit fallout). The postures towards exhaustiveness and nulls are independent and can be dealt with in detail with other constructs. Here are the use cases: default: break; //NPE else fallout: legacy behavior /*nothing*/ // same as "default: break" but did you really mean it? default: throw; //never fallout; this is built into expression switches case Object: break; //fallout includes null case null: break; //same, but are you expecting exhaustiveness? case null: break; default: throw; //fallout on null but otherwise exhaustive Different styles of coding will use different formulations. Finally, I want to point out that there are two good reasons for thinking hard about exhaustiveness at this point, and why we can't just postpone it for future work: 1. Expression switches *must* be exhaustive, so we need to define all the checks, translation strategies, and runtime exceptions that are entailed. It's a reasonable consistency play to cross-apply the relevant goodies to statement switches. 2. We are greatly complicating the sub-language of case labels, and we expect to add new kinds of "exhaustible" types to switches. 
In the past we expected users to reason about fallout behaviors by inspecting the switch cases and making reasonable conclusions about need for a default. That will become harder to do as case labels become overlapping and more intertwined with the type system. It's a good time to define an easy-to-use "seat belt" (or lead apron?) to protect against unexpected fallout. ? John P.S. I found a discussion about exhaustive enum switches here: https://stackoverflow.com/questions/5013194/why-is-default-required-for-a-switch-on-an-enum-in-this-code It discusses the need to put a "default: throw" in such switches, since the language currently does not observe that enum is an exhaustible type. JLS 14.11 says: "A Java compiler is encouraged (but not required) to provide a warning if a switch on an enum-valued expression lacks a default label and lacks case labels for one or more of the enum's constants. Such a switch will silently do nothing if the expression evaluates to one of the missing constants." That is one kind of linty warning that might help. But a more subtle one would be, "your switch looks exhaustive, but you are not protecting against future novelty values." Fixing that warning would prevent some really subtle bugs, and that's a good job for "default: throw". -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Fri Mar 16 02:06:06 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 15 Mar 2018 22:06:06 -0400 Subject: Patterns and nulls In-Reply-To: <24B7CD49-8F56-43C4-8008-F45F1E9B2E29@oracle.com> References: <0cb9a653-d087-e4a2-da22-6d5395305580@oracle.com> <7c6088f7-d7f2-4386-ff3a-00e38ff4b5df@oracle.com> <24B7CD49-8F56-43C4-8008-F45F1E9B2E29@oracle.com> Message-ID: <9E5D9F92-3F9B-4E5C-95B2-CA6C344B30D6@oracle.com> > On Mar 15, 2018, at 9:59 PM, John Rose wrote: > . . . > (Think of somebody adding Yule or Duodecember to enum Month.) Febtober! From peter.levart at gmail.com Fri Mar 16 08:50:29 2018 From: peter.levart at gmail.com (Peter Levart) Date: Fri, 16 Mar 2018 09:50:29 +0100 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> Message-ID: Hi, May I jump in as an outsider and someone who's just using the language... On 03/15/18 22:58, John Rose wrote: > On Mar 15, 2018, at 2:13 PM, Maurizio Cimadamore > > wrote: >> >> So, from a language design perspective, 'return x' is wrong - but, as >> you point out, we already committed the original sin of having >> 'return == local return' for lambdas, so I'm not too convinced that >> we couldn't use the same story again here. E.g. when you say >> 'return', what you really mean is 'returning from the innermost >> context'. This could be a method (as usual), or a nested expression >> e.g. a lambda or a switch expression. >> >> > We have method bodies and lambda bodies on one hand, > and we have switches and loops on the other. Yes, and my intuitive distinction between those two kinds of constructs is that the first are just "declarations" of code blobs, while the second are code blobs that execute in-line with the surrounding code. It is therefore very intuitive for me to have two kinds of syntax for exiting the constructs - "return" for the first and "break" for the second. I don't know why others find break so archaic. When I 1st saw this proposal, I thought that break was very intuitive choice for e-switch. 
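To make that distinction concrete, here is a minimal sketch (it assumes the "break value" form for expression switches discussed above; `k` is an int, and Supplier is java.util.function.Supplier):

    int count = 0;
    int v = switch (k) {          // the switch expression runs in-line, in the enclosing frame...
        case 1:
            count++;              // ...so its arms may read and write enclosing locals
            break 10;
        default:
            break 20;
    };
    Supplier<Integer> s = () -> { // a lambda body is a separate "code blob"
        // count++;               // would not compile: count is no longer effectively final
        return 10;                // and this "return" is local to the lambda
    };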
> > We use return to escape from the former, and break to > escape from the latter. > > Note that return may or may not take an expression, > while break never does, at present. > > So far so good. ?Now we stir in expression switches. > Which side of the fence do they belong on? > > It seems to me that your position needs to argue > that e-switches belong with methods and lambdas, > because only return can take an expression. > If you can pull this off, then break doesn't need > to take an expression. > > Likewise, my position need to argue that giving "break" an > expression is reasonable. ?I don't need to argue > that expression switches are similar to legacy > switches. ?(But I'm trying to spike the argument > that it's hard to unify e-switches and s-switches, > so let's just fork the language with a new switch-like > feature for expressions.) > > But there are two reasons why e-switch doesn't > belong with method body and lambda body, > a shallow but strong one, and a deep one. > > Shallow but strong: ?e-switches are obviously switches. > > Deep: ?Lambda bodies and method bodies execute > in their own stack frames. ?Any up-level references > must be to final locals (or fields). ?Lambda bodies > and methods can execute at most one "return", > which tears down their frame. ?Expressions, > including expression switches, execute in the > frame of the containing lambda body or method > and can read *and write* local variables. > Expressions are inherently local to a frame > and can imperatively side effect it. That's another, more technical way of saying: methods and lambdas are declarations of code, switches and loops are in-line constructs that execute "immediately" in the surrounding context. Lambdas do "capture" surrounding context, but they don't execute in it (they can't modify locals, do long returns etc.). Speaking of long returns... If return was used for "yielding" a result from e-switch, how is one supposed to do a return from a method inside the e-switch: int m(int x) { ??? int y = switch (x) { ??? ??? case 1: return 12; // I want to return from m() here! ??? } } > > A "return" which in some contexts keeps the > stack frame and jumps somewhere is a weaker > return than today's return. ?(Weaker meaning > less can be concluded by observing it in code.) > > So I can't group e-switch cases with lambda bodies. > I know some have performed this feat to their own > satisfaction, but it's hard for me, in a way that > seems deeper than just learning curve. > > By now we recognize that adding an expression > to "break" is no big deal; it's a new overloading. > I agree that it is open to the accusation that it's not > thrifty, that "return" already does that job. > But it seems to me the shallow and deep points > above answer the accusation. > > For me, the cost of making "break" do a new > trick is paid for by the benefit of not inventing > a new switch-like expression (like ?: for if/else), > and not having to weaken "return". > > ? John I totally agree. There are some caveats though. What to do in situations like this, for example: int var_or_label = 13; int y = switch (x) { ??? case 1: ??? ??? var_or_label: { ??? ??? ??? break var_or_label; ??????? } ??? ??? // do we reach here? }; The standard means that Java took to avoid ambiguities caused by new features was to prioritize old behavior (varargs for example). In above sample, label would take precedence. It is easy to choose the var: ??? break (var_or_label); And now for something completely different... 
I think that with the introduction of e-switch, we might soon see it being "abused" for things like:

doSomething(
    par1,
    switch (1) { case 1:
        // compute result...
        break result;
    },
    par3
);

Are there any plans for such a construct "without the boilerplate" ;-)

Among existing reserved words, "do" seems most appropriate:

doSomething(
    par1,
    do {
        // compute result...
        break result;
    } while (false),
    par3
);

And if "while (false)" could be optional, we get:

doSomething(
    par1,
    do {
        // compute result...
        break result;
    },
    par3
);

Combining with lambdas, we get 3 ways to do the same thing:

x -> y

x -> { return y; }

x -> do { break y; }

Regards, Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From peter.levart at gmail.com  Fri Mar 16 09:39:05 2018
From: peter.levart at gmail.com (Peter Levart)
Date: Fri, 16 Mar 2018 10:39:05 +0100
Subject: break seen as a C archaism
In-Reply-To: 
References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr>
 <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com>
 <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com>
Message-ID: <5e222490-5865-1231-be7c-df27e1739518@gmail.com>

Expanding on do...

On 03/16/18 09:50, Peter Levart wrote:
> And if "while (false)" could be optional, we get:

Or better yet, make "while (true)" optional even in statement do, so we can finally do away with some more boilerplate:

for (;;) {
}

or

while (true) {
}

and simply do:

do {
}

For e-do, the choice of default "while (true)" is fine, because it aligns with the fact that there has to be a break somewhere to exit it anyway, because it has to yield a result. But there will be some danger that a programmer codes an infinite loop by mistake.

>
> doSomething(
>     par1,
>     do {
>         // compute result...
>         break result;
>     },
>     par3
> );
>

Expanding on e-do... It could be a building block for e-switch. Remi is advocating for expression-only case(s) in e-switch. Combined with e-do, we could write:

int y = switch (x) {
    case 1 -> 2;
    case 2 -> 3;
    case 3 -> do {
        r = ...;
        break r;
    };
};

What we lose here is fallthrough. And we still have "break" here too.

It's unfortunate that we couldn't find a way to have an e-{block} of a kind when lambdas were conceived. That way we could get away with expression-only lambdas and expression-only cases in e-switch using the same building block. But I guess the statement lambda has its weight with its "return" sticking out and reminding us about the scope.

Regards, Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From forax at univ-mlv.fr  Fri Mar 16 09:56:18 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 16 Mar 2018 10:56:18 +0100 (CET)
Subject: break seen as a C archaism
In-Reply-To: <5e222490-5865-1231-be7c-df27e1739518@gmail.com>
References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr>
 <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com>
 <5e222490-5865-1231-be7c-df27e1739518@gmail.com>
Message-ID: <1269469193.2037181.1521194178178.JavaMail.zimbra@u-pem.fr>

Hi Peter,
I think this has been ruled out by Brian saying that we do not want to add a block that ends with an expression in Java.
And i think we can use parenthesis to avoid to re-interpret what a int value = do( int x = foo(); break x * x; ); If we go in that direction, i think i prefer the comprehension-like syntax where you put the expression giving the result first and then the block that calculates the value, it has the advantage of being an expression, so it fits with the arrow syntax switch(value) { case 0 -> 0 case 1 -> x * x with { x = foo(); } } regards, R?mi > De: "Peter Levart" > ?: "John Rose" , "Maurizio Cimadamore" > > Cc: "amber-spec-experts" > Envoy?: Vendredi 16 Mars 2018 10:39:05 > Objet: Re: break seen as a C archaism > Expanding on do... > On 03/16/18 09:50, Peter Levart wrote: >> And if "while (false)" could be optional, we get: > Or better yet, make "while (true)" optional even in statement do, so we can > finally get away with some more boilerplate: > for (;;) { > } > or > while (true) { > } > and simply do: > do { > } > For e-do, the choice of default "while (true)" is fine, because it aligns with > the fact that there has to be a break somewhere to exit it anyway > because it has to yield a result. But there will be some danger that a > programmer codes an infinite loop by mistake. >> doSomething( >> par1, >> do { >> // compute result... >> break resut; >> }, >> par3 >> ); > Expanding on e-do... It could be a building block for e-switch. Remi is > advocating for expression-only case(s) in e-switch. Combined with e-do, we > could write: > int y = switch (x) { > case 1 -> 2; > case 2 -> 3; > case 3 -> do { > r = ...; > break r; > }; > }; > What we loose here is fallthrough. And we still have "break" here too. > It's unfortunate that we couldn't find a way to have an e-{block} of a kind when > lambdas have been conceived. That way we could get away with expression-only > lambdas and expression-only cases in e-switch using the same building block. > But I guess the statement lambda has its weight with its "return" sticking out > and reminding us about the scope. > Regards, Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From maurizio.cimadamore at oracle.com Fri Mar 16 12:43:11 2018 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Fri, 16 Mar 2018 12:43:11 +0000 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> Message-ID: <9375d004-fe73-7f71-ebcf-8daf23a4bac2@oracle.com> Hi Peter, John, I think I found your arguments convincing, both from a technical point of view - the stack analogy was a good one, and from a programming model point of view - declaration vs. block of code. I believe these are good basis for why we should break away from what we have chosen to do for lambdas. Maurizio On 16/03/18 08:50, Peter Levart wrote: > Yes, and my intuitive distinction between those two kinds of > constructs is that the first are just "declarations" of code blobs, > while the second are code blobs that execute in-line with the > surrounding code. It is therefore very intuitive for me to have two > kinds of syntax for exiting the constructs - "return" for the first > and "break" for the second. 
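For concreteness, a minimal sketch of the frame distinction argued above -- plain Java 8, nothing from the switch-expression proposal itself, and the class and method names (FrameDemo, demo) are made up for illustration. The "return" inside the lambda leaves only the lambda's own frame, while the "break" in the loop stays in, and can side-effect, the frame of the enclosing method:

    import java.util.function.IntUnaryOperator;

    class FrameDemo {
        static int demo(int[] values) {
            // this "return" is local to the lambda; it does not return from demo()
            IntUnaryOperator doubler = v -> { return v * 2; };

            int firstDoubledNegative = 0;          // local state of demo()'s frame
            for (int v : values) {
                if (v < 0) {
                    firstDoubledNegative = doubler.applyAsInt(v);
                    break;                         // escapes the loop, not the method
                }
            }
            return firstDoubledNegative;           // this "return" tears down demo()'s frame
        }

        public static void main(String[] args) {
            System.out.println(demo(new int[] { 3, -4, 5 }));  // prints -8
        }
    }

The quoted "declaration vs. in-line block" distinction is visible in the bytecode as well: the lambda body runs in its own frame and can only read the (effectively final) locals it captures, while the loop runs in demo()'s frame and freely writes them.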
From brian.goetz at oracle.com Fri Mar 16 13:51:20 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 16 Mar 2018 09:51:20 -0400 Subject: break seen as a C archaism In-Reply-To: References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr> <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com> <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com> Message-ID: <2822f831-09c8-046e-9acc-7ff0151923e2@oracle.com> > > I don't know why others find break so archaic. When I 1st saw this > proposal, I thought that break was very intuitive choice for e-switch. I think this is mostly an emotional reaction.? There are plenty of things to dislike about switch in Java; I think that for some, the prospect of switch getting an overhaul but not "fixing" the things about it you hate most, feels like a slap in the face.? (I think there's also a bit of wishful thinking, that if we had a new shiny expression, we'd forget that it is almost but not quite like this existing construct that we can almost but not quite forget about.) It's like when you get an upgrade to a software package I use a lot, I see they've changed all the cosmetic stuff, but the annoying behaviors are still there, and you wonder, "what was the point of that upgrade?"? But, the goal was not to fix switch, as much as extend it to do new things, and that means the things it already did, it should keep doing them that way. As a reminder: while switch expressions are great (which is why we factored them out of the larger pattern effort), they are the "opportunistic" feature here; the real story is pattern matching (whose first target will be algebraic data types -- records and sealed classes). > Speaking of long returns... > > If return was used for "yielding" a result from e-switch, how is one > supposed to do a return from a method inside the e-switch: > > int m(int x) { > ??? int y = switch (x) { > ??? ??? case 1: return 12; // I want to return from m() here! > ??? } > } Not allowed.? A switch expression, like a conditional expression (or any expression, for that matter), must yield a value or throw.? It can't do nonlocal control flow into enclosing contexts, except throwing. > And now for something completely different... > > I think that with introduction of e-switch, we might soon see it being > "abused" for things like: > > doSomething( > ??? par1, > ??? switch (1) { case 1: > ??? ??? // compute result... > ?? ??? break resut; > ??? }, > ??? par3 > ); > > Are there any plans for such construct "without the boilerplate" ?-) Yes, switch expressions can be abused to become block expressions; switch (0) { default: s; s; break e; } becomes a block expression. That's ugly enough that maybe it will discourage people from doing this.... We don't have plans to fix it, but we're open to ideas; a more natural way to treat blocks of statements like: ??? var x = new Foo(); ??? x.setA(3); ??? return x; as an expression would have removed a lot of the angst over this feature.? Maybe it could be written ??? with (new Foo()) { setA(3); } or something.? But, its not the top priority right now. > Among existing reserved words, "do" seems most appropriate: Yes, do { } could become an expression with "break e".? (And maybe even make the while optional.) 
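For concreteness, until something like "do { ... break e; }" exists, the "run a few statements and use the result as an expression" need is usually met with a private helper method or an immediately-invoked lambda. A rough sketch of the latter; Foo and setA are placeholder names echoing the example above:

    import java.util.function.Supplier;

    class BlockExpressionWorkaround {
        static class Foo {
            private int a;
            void setA(int a) { this.a = a; }
            int a() { return a; }
        }

        public static void main(String[] args) {
            // "x = new Foo(); x.setA(3); use x" expressed as a single expression
            Foo foo = ((Supplier<Foo>) () -> {
                Foo x = new Foo();
                x.setA(3);
                return x;
            }).get();
            System.out.println(foo.a());   // prints 3
        }
    }

The cast-and-get() boilerplate is arguably as ugly as the switch (0) trick, which is part of why a dedicated block-expression form keeps coming up.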
From guy.steele at oracle.com  Fri Mar 16 14:59:39 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 16 Mar 2018 10:59:39 -0400
Subject: break seen as a C archaism
In-Reply-To: <5e222490-5865-1231-be7c-df27e1739518@gmail.com>
References: <902821304.2225181.1520630358844.JavaMail.zimbra@u-pem.fr>
 <746FE007-CD7D-42C9-8209-1FC9D0EBBFDF@oracle.com>
 <6c3d85a0-3c42-3712-baf3-8b2e244e139b@oracle.com>
 <5e222490-5865-1231-be7c-df27e1739518@gmail.com>
Message-ID: <5143285B-174F-44EB-AF08-EABA8BF20671@oracle.com>

> On Mar 16, 2018, at 5:39 AM, Peter Levart wrote:
>
> Expanding on do...

Well, as long as we are fantasizing:

> On 03/16/18 09:50, Peter Levart wrote:
>> And if "while (false)" could be optional, we get:
>
> Or better yet, make "while (true)" optional even in statement do, so we can finally do away with some more boilerplate:
>
> for (;;) {
> }
>
> or
>
> while (true) {
> }
>
> and simply do:
>
> do {
> }
>
> For e-do, the choice of default "while (true)" is fine, because it aligns with the fact that there has to be a break somewhere to exit it anyway, because it has to yield a result. But there will be some danger that a programmer codes an infinite loop by mistake.

This idea of making "while (true)" be the default has made me realize that Java has excellent syntactic space available to support a style of programming that was mildly popular in the late 1970s and early 1980s: Dijkstra's "guarded commands" as described in his 1976 book "A Discipline of Programming". It's a beautifully concise and symmetric theory of control structure, provided that you have committed to a style of programming entirely centered around assignment statements (which explains why it has largely fallen by the wayside). Consider the following tale a trip down memory lane to an alternate universe:

______________________________________________________________________________

Let a guarded command in Java have this form:

    case booleanExpression: statement*

(Dijkstra used the form "booleanExpression -> statement*". It's more Java-like to use the keyword `case` and a colon.)

______________________________________________________________________________

First, we define a statement form of do:

    do { guarded-command* }

The semantics are: evaluate the guards of all the guarded commands; if none is true, then terminate, else nondeterministically choose some command whose guard was true, execute its statements, and then repeat this process.

It can be syntactically distinguished from the existing `do` statement in two ways: it contains guarded commands rather than simple statements, and there is no `while (expression)` at the end.

Example:

    {
        int j = 0;
        do { case j < n: a[j] = f(j); ++j; }
    }

______________________________________________________________________________

Corresponding to that is a statement form of `if`:

    if { guarded-command* }

The semantics are: evaluate the guards of all the guarded commands; if none is true, then abort (program error), else nondeterministically choose some command whose guard was true and execute its statements.

It can be syntactically distinguished from the existing `if` statement because there is a left-brace after the keyword `if`.

Example:

    if {
        case x <= 0: a = -b;
        case x >= 0: a = b;
    }

______________________________________________________________________________

Now, to make expression forms of these puppies (something Dijkstra never envisioned), we need a way to yield values.
For an `if`, we just use `break`:

    int a = if {
        case x <= 0: break -b;
        case x >= 0: break b;
    };

______________________________________________________________________________

In the same manner as for `switch`, we can abbreviate `: break` as `->`:

    int a = if {
        case x <= 0 -> -b;
        case x >= 0 -> b;
    };

This feels very much like an expression switch, but with choices made by boolean expressions rather than a switch value.

______________________________________________________________________________

Now, the expression form of `do` is a bit trickier, because the `do` terminates only when no guard is true, so the return value cannot be specified in any of the contained statements. The "obvious" answer is:

    do { guarded-command* } break expression

and at this point everyone else on this mailing list breaks out the torches and pitchforks to come after me.  :-)

______________________________________________________________________________

--Guy

P.S. "Did you succeed in using your time machine to get guarded commands into Java?"

"Yes, but in a past that hasn't happened yet--it's complicated."

(Tip of the hat to Norm Feuti.)

From brian.goetz at oracle.com  Fri Mar 16 16:59:13 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 16 Mar 2018 12:59:13 -0400
Subject: Switch expressions -- next steps
Message-ID: 

Without cutting off useful discussions in progress, it seems to me that we've reached a point where the design decisions for JEP 325 are largely stabilized and there's a prototype available that implements this design point. (Much of the recent discussion, especially on nulls and exhaustiveness, is more relevant to the larger effort to support patterns in switch, though some of it projects into the subset specified by JEP 325.) As a reminder, our motivation for splitting JEP 325 off from the larger effort is that (a) it is useful independent of patterns and (b) constitutes a stable, smaller piece we can deliver to users sooner.

Our next steps are to update the JEP document to reflect recent decisions and clarify the scope of the feature, and get a draft specification that captures the design center we are circling. (I'll also try to get a draft of the current status of the broader pattern matching effort out soon.)

It would be great if people could try out the prototype on their codebases and report any feedback! For the IDE folks here, we'd welcome any preliminary support for the feature, as it seems to have stabilized to the point where it can be prototyped effectively.

From brian.goetz at oracle.com  Fri Mar 16 18:55:19 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 16 Mar 2018 14:55:19 -0400
Subject: Records -- current status
Message-ID: 

There are a number of potentially open details on the design for records. My inclination is to start with the simplest thing that preserves the flexibility and expectations we want, and consider opening up later as necessary.

One of the biggest issues, which Kevin raised as a must-address issue, is having sufficient support for precondition validation. Without foreclosing on the ability to do more later with declarative guards, I think the recent construction proposal meets the requirement for lightweight enforcement with minimal or no duplication. I'm hopeful that this bit is "there".

Our goal all along has been to define records as being "just macros" for a finer-grained set of features. Some of these are motivated by boilerplate; some are motivated by semantics (coupling semantics of API elements to state.)
In general, records will get there first, and then ordinary classes will get the more general feature, but the default answer for "can you relax records, so I can use it in this case that almost but doesn't quite fit" should be "no, but there will probably be a feature coming that makes that class simpler, wait for that." Some other open issues (please see my writeup at http://cr.openjdk.java.net/~briangoetz/amber/datum.html for reference), and my current thoughts on these, are outlined below. Comments welcome! ?- Extension.? The proposal outlines a notion of abstract record, which provides a "width subtyped" hierarchy.? Some have questioned whether this carries its weight, especially given how Scala doesn't support case-to-case extension (some see this as a bug, others as an existence proof.)? Records can implement interfaces. ?- Concrete records are final.? Relaxing this adds complexity to the equality story; I'm not seeing good reasons to do so. ?- Additional constructors.? I don't see any reason why additional constructors are problematic, especially if they are constrained to delegate to the default constructor (which in turn is made far simpler if there can be statements ahead of the this() call.) Users may find the lack of additional constructors to be an arbitrary limitation (and they'd probably be right.) ?- Static fields.? Static fields seem harmless. ?- Additional instance fields.? These are a much bigger concern. While the primary arguments against them are of the "slippery slope" variety, I still have deep misgivings about supporting unrestricted non-principal instance fields, and I also haven't found a reasonable set of restrictions that makes this less risky.? I'd like to keep looking for a better story here, before just caving on this, as I worry doing so will end up biting us in the back. ?- Mutability and accessibility.? I'd like to propose an odd choice here, which is: fields are final and package (protected for abstract records) by default, but finality can be explicitly opted out of (non-final) and accessibility can be explicitly widened (public). ?- Accessors.? Perhaps the most controversial aspect is that records are inherently transparent to read; if something wants to truly encapsulate state, it's not a record.? Records will eventually have pattern deconstructors, which will expose their state, so we should go out of the gate with the equivalent.? The obvious choice is to expose read accessors automatically.? (These will not be named getXxx; we are not burning the ill-advised Javabean naming conventions into the language, no matter how much people think it already is.)? The obvious naming choice for these accessors is fieldName().? No provision for write accessors; that's bring-your-own. ?- Core methods.? Records will get equals, hashCode, and toString.? There's a good argument for making equals/hashCode final (so they can't be explicitly redeclared); this gives us stronger preservation of the data invariants that allow us to safely and mechanically snapshot / serialize / marshal (we'd definitely want this if we ever allowed additional instance fields.)? No reason to suppress override of toString, though. Records could be safely made cloneable() with automatic support too (like arrays), but not clear if this is worth it (its darn useful for arrays, though.)? I think the auto-generated getters should be final too; this leaves arrays as second-class components, but I am not sure that bothers me. 
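For concreteness, a rough sketch of the hand-written class that a small record such as `record Range(int lo, int hi)` would stand for under the proposal above: final class, final fields, canonical constructor, accessors named after the fields (not getLo()), and state-based equals/hashCode/toString. Range is a made-up example, and the exact accessibility and finality defaults are among the open items listed above:

    import java.util.Objects;

    final class Range {
        private final int lo;   // default accessibility of these fields is still an open question
        private final int hi;

        Range(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        int lo() { return lo; }     // accessor named after the field, not getLo()
        int hi() { return hi; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Range
                && ((Range) o).lo == lo
                && ((Range) o).hi == hi;
        }

        @Override
        public int hashCode() {
            return Objects.hash(lo, hi);
        }

        @Override
        public String toString() {
            return "Range[lo=" + lo + ", hi=" + hi + "]";
        }
    }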
From forax at univ-mlv.fr Fri Mar 16 20:45:53 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 16 Mar 2018 21:45:53 +0100 (CET) Subject: Records -- current status In-Reply-To: References: Message-ID: <1214486812.2461440.1521233153357.JavaMail.zimbra@u-pem.fr> Hi Brian, ----- Mail original ----- > De: "Brian Goetz" > ?: "amber-spec-experts" > Envoy?: Vendredi 16 Mars 2018 19:55:19 > Objet: Records -- current status > There are a number of potentially open details on the design for > records.? My inclination is to start with the simplest thing that > preserves the flexibility and expectations we want, and consider opening > up later as necessary. > > One of the biggest issues, which Kevin raised as a must-address issue, > is having sufficient support for precondition validation. Without > foreclosing on the ability to do more later with declarative guards, I > think the recent construction proposal meets the requirement for > lightweight enforcement with minimal or no duplication.? I'm hopeful > that this bit is "there". I agree, having the user write their own write accessor is a fine idea. Users will still have to write the doc manually, and maintains the encapsulation correctly but in term of design, this allow us to move forward by separating the concept of record and the concept of precondition guard. Kudo to you for that ! > > Our goal all along has been to define records as being ?just macros? for > a finer-grained set of features.? Some of these are motivated by > boilerplate; some are motivated by semantics (coupling semantics of API > elements to state.)? In general, records will get there first, and then > ordinary classes will get the more general feature, but the default > answer for "can you relax records, so I can use it in this case that > almost but doesn't quite fit" should be "no, but there will probably be > a feature coming that makes that class simpler, wait for that." > > > Some other open issues (please see my writeup at > http://cr.openjdk.java.net/~briangoetz/amber/datum.html for reference), > and my current thoughts on these, are outlined below. Comments welcome! > > ?- Extension.? The proposal outlines a notion of abstract record, which > provides a "width subtyped" hierarchy.? Some have questioned whether > this carries its weight, especially given how Scala doesn't support > case-to-case extension (some see this as a bug, others as an existence > proof.)? Records can implement interfaces. Kotlin does not support abstract data class too. And we can still add abstract record later. > > ?- Concrete records are final.? Relaxing this adds complexity to the > equality story; I'm not seeing good reasons to do so. i fully agree. > > ?- Additional constructors.? I don't see any reason why additional > constructors are problematic, especially if they are constrained to > delegate to the default constructor (which in turn is made far simpler > if there can be statements ahead of the this() call.) Users may find the > lack of additional constructors to be an arbitrary limitation (and > they'd probably be right.) i agree. > > ?- Static fields.? Static fields seem harmless. a static field is never harmless, eager initialization and being a root for the GC can make them dangerous, but we should support them. > > ?- Additional instance fields.? These are a much bigger concern. 
While > the primary arguments against them are of the "slippery slope" variety, > I still have deep misgivings about supporting unrestricted non-principal > instance fields, and I also haven't found a reasonable set of > restrictions that makes this less risky.? I'd like to keep looking for a > better story here, before just caving on this, as I worry doing so will > end up biting us in the back. data class <=> no hidden states, so no secondary instance fields. And again, we can add them later, if needed. > > ?- Mutability and accessibility.? I'd like to propose an odd choice > here, which is: fields are final and package (protected for abstract > records) by default, but finality can be explicitly opted out of > (non-final) and accessibility can be explicitly widened (public). I think that record fields should be private thus only visible by the nestmates by default. I will have agree about being package visible by default before the introduction of nestmates, but now that we have nestmates, i do not see the need for having an unrelated class in the same package to see the implementation of a record. Having the field protected for abstract record is coherent with the fact that abstract record show too much detail of implementation and should not exist. > > ?- Accessors.? Perhaps the most controversial aspect is that records > are inherently transparent to read; if something wants to truly > encapsulate state, it's not a record.? Records will eventually have > pattern deconstructors, which will expose their state, so we should go > out of the gate with the equivalent.? The obvious choice is to expose > read accessors automatically.? (These will not be named getXxx; we are > not burning the ill-advised Javabean naming conventions into the > language, no matter how much people think it already is.)? The obvious > naming choice for these accessors is fieldName().? No provision for > write accessors; that's bring-your-own. Again, no write accessor by default is a clever idea ! > > ?- Core methods.? Records will get equals, hashCode, and toString. > There's a good argument for making equals/hashCode final (so they can't > be explicitly redeclared); this gives us stronger preservation of the > data invariants that allow us to safely and mechanically snapshot / > serialize / marshal (we'd definitely want this if we ever allowed > additional instance fields.)? No reason to suppress override of > toString, though. Records could be safely made cloneable() with > automatic support too (like arrays), but not clear if this is worth it > (its darn useful for arrays, though.)? I think the auto-generated > getters should be final too; this leaves arrays as second-class > components, but I am not sure that bothers me. i am lost here, if the record is final thus all the methods are final, no ?? Overall, i think this proposed design is sound and great. I've no objection to its inclusion in a future release. cheers, R?mi From brian.goetz at oracle.com Fri Mar 16 20:53:16 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 16 Mar 2018 16:53:16 -0400 Subject: Records -- current status In-Reply-To: <1214486812.2461440.1521233153357.JavaMail.zimbra@u-pem.fr> References: <1214486812.2461440.1521233153357.JavaMail.zimbra@u-pem.fr> Message-ID: <33f58fe0-6863-0651-33f6-1db0dffa8838@oracle.com> On 3/16/2018 4:45 PM, Remi Forax wrote: >> ?- Mutability and accessibility.? 
I'd like to propose an odd choice >> here, which is: fields are final and package (protected for abstract >> records) by default, but finality can be explicitly opted out of >> (non-final) and accessibility can be explicitly widened (public). > I think that record fields should be private thus only visible by the nestmates by default. > I will have agree about being package visible by default before the introduction of nestmates, but now that we have nestmates, i do not see the need for having an unrelated class in the same package to see the implementation of a record. The motivation for "package" is that this is the default in classes, so it would be one less thing that is different about records. Minimizing the differences between records and classes facilitates refactoring back and forth.? (Both refactoring directions are valuable; existing classes can be refactored to records to squeeze away the low-value code, but at some point, a record may cross the boundary of what records can do, and have to be refactored to a class (just as enums sometimes hit their limits and have to be refactored away.)? Minimizing the skew here helps. > Having the field protected for abstract record is coherent with the fact that abstract record show too much detail of implementation and should not exist. On further thought, these can be package or private too, since the subclass can call the getter. > ?- Core methods.? Records will get equals, hashCode, and toString. >> There's a good argument for making equals/hashCode final (so they can't >> be explicitly redeclared); this gives us stronger preservation of the >> data invariants that allow us to safely and mechanically snapshot / >> serialize / marshal (we'd definitely want this if we ever allowed >> additional instance fields.)? No reason to suppress override of >> toString, though. Records could be safely made cloneable() with >> automatic support too (like arrays), but not clear if this is worth it >> (its darn useful for arrays, though.)? I think the auto-generated >> getters should be final too; this leaves arrays as second-class >> components, but I am not sure that bothers me. > i am lost here, if the record is final thus all the methods are final, no ?? > So, if there is an implicit member implementation (like equals()), then an explicit implementation is like an override (even though its in the same class).? So in this context, final means not only "subclass may not override", but also means "record itself cannot provide explicit implementation." -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Fri Mar 16 21:14:02 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 16 Mar 2018 14:14:02 -0700 Subject: Records -- current status In-Reply-To: References: Message-ID: On Fri, Mar 16, 2018 at 11:55 AM, Brian Goetz wrote: There are a number of potentially open details on the design for records. > My inclination is to start with the simplest thing that preserves the > flexibility and expectations we want, and consider opening up later as > necessary. > > One of the biggest issues, which Kevin raised as a must-address issue, is > having sufficient support for precondition validation. Without foreclosing > on the ability to do more later with declarative guards, I think the recent > construction proposal meets the requirement for lightweight enforcement > with minimal or no duplication. I'm hopeful that this bit is "there". > Agreed. 
Even if we had solved declarative guards, we'd still benefit from what you're doing here when we need a defensive copy etc. > Our goal all along has been to define records as being ?just macros? for a > finer-grained set of features. Some of these are motivated by boilerplate; > some are motivated by semantics (coupling semantics of API elements to > state.) In general, records will get there first, and then ordinary > classes will get the more general feature, but the default answer for "can > you relax records, so I can use it in this case that almost but doesn't > quite fit" should be "no, but there will probably be a feature coming that > makes that class simpler, wait for that." > > > Some other open issues (please see my writeup at > http://cr.openjdk.java.net/~briangoetz/amber/datum.html for reference), > and my current thoughts on these, are outlined below. Comments welcome! > > - Extension. The proposal outlines a notion of abstract record, which > provides a "width subtyped" hierarchy. Some have questioned whether this > carries its weight, especially given how Scala doesn't support case-to-case > extension (some see this as a bug, others as an existence proof.) Records > can implement interfaces. > I also suggest we avoid abstract records. A reference to one may seem like a proper record but it will behave badly with regard to equals(). I don't see the upside compared to a common interface, and then you don't have to have the novel parameterized extends clause. - Concrete records are final. Relaxing this adds complexity to the > equality story; I'm not seeing good reasons to do so. > Absolutely. > - Additional instance fields. These are a much bigger concern. While the > primary arguments against them are of the "slippery slope" variety, I still > have deep misgivings about supporting unrestricted non-principal instance > fields, and I also haven't found a reasonable set of restrictions that > makes this less risky. I'd like to keep looking for a better story here, > before just caving on this, as I worry doing so will end up biting us in > the back. > Lazy-initialized derived values are common enough. I'm not grasping what there is to be afraid of here - I thought that preventing custom eq/hc addressed the concerns. I understand the spirit of "start restrictive and open up later", but I think we should still have some halfway convincing explanation of why this was worth worrying about. > - Mutability and accessibility. I'd like to propose an odd choice here, > which is: fields are final and package (protected for abstract records) by > default, but finality can be explicitly opted out of (non-final) and > accessibility can be explicitly widened (public). > I agree that field accessibility should play by the normal rules. On the other hand. As much as I want everyone to stick to immutable records as much as possible, it seems very costly to me to have to introduce a new keyword for "not final", and have users keep track of which things have which defaults. Let this just be "best practice", like it already is for regular fields (make them final unless you have good reason not to). - Accessors. Perhaps the most controversial aspect is that records are > inherently transparent to read; if something wants to truly encapsulate > state, it's not a record. Records will eventually have pattern > deconstructors, which will expose their state, so we should go out of the > gate with the equivalent. The obvious choice is to expose read accessors > automatically. 
(These will not be named getXxx; we are not burning the > ill-advised Javabean naming conventions into the language, no matter how > much people think it already is.) The obvious naming choice for these > accessors is fieldName(). No provision for write accessors; that's > bring-your-own. > Method and field named identically is a slight concern. If we gain field references using the same syntax as method references there would probably be no way to refer to such a field. I'm pretty sure this is not worth worrying about though. - Core methods. Records will get equals, hashCode, and toString. There's > a good argument for making equals/hashCode final (so they can't be > explicitly redeclared); this gives us stronger preservation of the data > invariants that allow us to safely and mechanically snapshot / serialize / > marshal (we'd definitely want this if we ever allowed additional instance > fields.) No reason to suppress override of toString, though. Agree with all this. > Records could be safely made cloneable() with automatic support too (like > arrays), but not clear if this is worth it (its darn useful for arrays, > though.) > People just really need to not use arrays anymore, and especially not with records. imho we should have added immutable List and ImmutableIntArray etc. classes a very long time ago. I know we won't now due to our value type aspirations. In the meantime we're in a weird place. Arrays are completely terrible except as micro-optimizations to be used with great care. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 16 21:28:37 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 16 Mar 2018 17:28:37 -0400 Subject: Records -- current status In-Reply-To: References: Message-ID: > > ?- Extension.? The proposal outlines a notion of abstract record, > which provides a "width subtyped" hierarchy.? Some have questioned > whether this carries its weight, especially given how Scala > doesn't support case-to-case extension (some see this as a bug, > others as an existence proof.)? Records can implement interfaces. > > > I also suggest we avoid abstract records. A reference to one may seem > like a proper record but it will behave badly with regard to equals(). > I don't see the upside compared to a common interface, and then you > don't have to have the novel parameterized extends clause. The hackneyed example: consider the below as a subset of a typical "model an expression with a tree" hierarchy.? Of course, a real hierarchy would have a lot more classes. ??? sealed interface Node; ??? record ValNode(int value) extends Node; ??? record VarNode(String name) extends Node; ??? abstract record BinOpNode(Node left, Node right) extends Node; ??? record PlusNode(Node left, Node right) extends BinOpNode(left, right); ??? record MulNode(Node left, Node right) extends BinOpNode(left, right); Obviously there might be some common behavior for binary operation nodes that can be factored up into BinOpNode.? But also, there are times when matching against the abstract type makes sense too.? For example, if you want to traverse the tree and perform structural operations (say, detect if a tree contains a reference to the variable "x"), matching on abstract records is pretty useful: ??? boolean containsVar(Node node, String name) { ??????? return switch (node) { ??????????? case VarNode(String s) -> s.equals(name); ??????????? 
case BinOpNode(var left, var right) -> containsVar(left, name) || containsVar(right, name); ??????????? default -> false; ??????? } ??? } A client who is only interested in structural properties can match once on the abstract type, instead of matching explicitly on N effectively identical cases (and add more every time the hierarchy changes.) > > On the other hand. As much as I want everyone to stick to immutable > records as much as possible, it seems very costly to me to have to > introduce a new keyword for "not final", and have users keep track of > which things have which defaults. Let this just be "best practice", > like it already is for regular fields (make them final unless you have > good reason not to). Pretend we already had non-final.? Does that change your inclination?? (When we do sealed types, we're likely going to need a way to say non-sealed anyway.) > > ?- Accessors.? Perhaps the most controversial aspect is that > records are inherently transparent to read; if something wants to > truly encapsulate state, it's not a record. Records will > eventually have pattern deconstructors, which will expose their > state, so we should go out of the gate with the equivalent.? The > obvious choice is to expose read accessors automatically.? (These > will not be named getXxx; we are not burning the ill-advised > Javabean naming conventions into the language, no matter how much > people think it already is.)? The obvious naming choice for these > accessors is fieldName().? No provision for write accessors; > that's bring-your-own. > > > Method and field named identically is a slight concern. If we gain > field references using the same syntax as method references there > would probably be no way to refer to such a field. I'm pretty sure > this is not worth worrying about though. I have a story for disambiguating field references... > Records could be safely made cloneable() with automatic support > too (like arrays), but not clear if this is worth it (its darn > useful for arrays, though.) > > > People just really need to not use arrays anymore, and especially not > with records. imho we should have added immutable List and > ImmutableIntArray etc. classes a very long time ago. I know we won't > now due to our value type aspirations. In the meantime we're in a > weird place. Arrays are completely terrible except as > micro-optimizations to be used with great care. > OK, but do? you have an opinion on whether records should automatically acquire a clone() implementation? -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Fri Mar 16 21:59:47 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 16 Mar 2018 14:59:47 -0700 Subject: Records -- current status In-Reply-To: References: Message-ID: On Fri, Mar 16, 2018 at 2:28 PM, Brian Goetz wrote: > > But also, there are times when matching against the abstract type makes > sense too. 
For example, if you want to traverse the tree and perform > structural operations (say, detect if a tree contains a reference to the > variable "x"), matching on abstract records is pretty useful: > > boolean containsVar(Node node, String name) { > return switch (node) { > case VarNode(String s) -> s.equals(name); > case BinOpNode(var left, var right) -> containsVar(left, name) > || containsVar(right, name); > default -> false; > } > } > Am I correct that if BinOpNode is an interface there will be a way for it to specify how it destructures so that it can get this effect also - and it's just that records are neat because they know how to destructure for free? We want people to be solid on the fact that two records with all the same field values are always equals(), and then they may apply that view to an abstract record type where it doesn't hold true. On the other hand. As much as I want everyone to stick to immutable records > as much as possible, it seems very costly to me to have to introduce a new > keyword for "not final", and have users keep track of which things have > which defaults. Let this just be "best practice", like it already is for > regular fields (make them final unless you have good reason not to). > > > Pretend we already had non-final. Does that change your inclination? > I don't think so? The reversed default behavior feels like arbitrary difference from regular fields (again, I do *want* to encourage finalness of record fields...). Would we permit the "not final" keyword on interface fields too? Records could be safely made cloneable() with automatic support too (like >> arrays), but not clear if this is worth it (its darn useful for arrays, >> though.) >> > > People just really need to not use arrays anymore, and especially not with > records. imho we should have added immutable List and ImmutableIntArray > etc. classes a very long time ago. I know we won't now due to our value > type aspirations. In the meantime we're in a weird place. Arrays are > completely terrible except as micro-optimizations to be used with great > care. > > OK, but do you have an opinion on whether records should automatically > acquire a clone() implementation? > As much as possible we'll encourage all-final, array-free records that have no need to be cloned, but some number of records will go against that, and I guess it's better that they have clone() than that they don't. But my concern is: What does it do -- deep-clone arrays but shallow-clone everything else? Sounds problematic no matter which way you decide it. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 16 22:09:57 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 16 Mar 2018 18:09:57 -0400 Subject: Records -- current status In-Reply-To: References: Message-ID: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> On 3/16/2018 5:59 PM, Kevin Bourrillion wrote: > On Fri, Mar 16, 2018 at 2:28 PM, Brian Goetz > wrote: > > > But also, there are times when matching against the abstract type > makes sense too.? For example, if you want to traverse the tree > and perform structural operations (say, detect if a tree contains > a reference to the variable "x"), matching on abstract records is > pretty useful: > > ??? boolean containsVar(Node node, String name) { > ??????? return switch (node) { > ??????????? case VarNode(String s) -> s.equals(name); > ??????????? 
case BinOpNode(var left, var right) -> > containsVar(left, name) || containsVar(right, name); > ??????????? default -> false; > ??????? } > ??? } > > > Am I correct that if BinOpNode is an interface there will be a way for > it to specify how it destructures so that it can get this effect also > - and it's just that records are neat because they know how to > destructure for free? Destructuring for free is important, but it's not just destructuring -- it's all the stuff.? It means that the fields and accessors (and therefore, any methods derivable from that state that is common to all subtypes) get pulled into the abstract record too.? Remember, records can have behavior that is derived from their state.? So if there is any behavior that is natural on a BinaryOpNode, to put it there, it needs to have its state (or at least state accessors) there. > We want people to be solid on the fact that two records with all the > same field values are always equals(), and then they may apply that > view to an abstract record type where it doesn't hold true. equals() on abstract records is abstract; only the concrete record gets to declare equals. > > Pretend we already had non-final.? Does that change your inclination? > > I don't think so? The reversed default behavior feels like arbitrary > difference from regular fields (again, I do /want/?to encourage > finalness of record fields...). Would we permit the "not final" > keyword on interface fields too? Hadn't thought about that, but, assuming we didn't think that was a bad idea, yes, we surely could do that.? (In other words; interface fields should be final because we think its dumb for them to be mutable, not because we don't have a way to spell it.) I am usually wary of "let's flip the default on this new thing because we can" arguments.? This seems one of the few places where we could really get away with it, so I want to consider it seriously.? If we think its a bad idea, I'm OK with ultimately saying "nah, its like classes."? But I don't want to skip over that deliberation, and certainly not for a silly reason like "but we can't spell it"! > OK, but do? you have an opinion on whether records should > automatically acquire a clone() implementation? > > As much as possible we'll encourage all-final, array-free records that > have no need to be cloned, but some number of records will go against > that, and I guess it's better that they have clone() than that they > don't. But my concern is: What does it do -- deep-clone arrays but > shallow-clone everything else? Sounds problematic no matter which way > you decide it. > Yes, that's the question.? One possibility is just to always clone shallowly; this is not as dumb as it sounds, since the fields are already exposed for read, and therefore any deep mutability is already flapping in the wind. The primary value of cloning would probably be taking snapshots of mutable things like statistics-gathering records, which are a related bag of mutable values, and sometimes you want a consistent snapshot. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kevinb at google.com Fri Mar 16 22:36:53 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 16 Mar 2018 15:36:53 -0700 Subject: Records -- current status In-Reply-To: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> Message-ID: On Fri, Mar 16, 2018 at 3:09 PM, Brian Goetz wrote: On 3/16/2018 5:59 PM, Kevin Bourrillion wrote: > > On Fri, Mar 16, 2018 at 2:28 PM, Brian Goetz > wrote: > >> >> But also, there are times when matching against the abstract type makes >> sense too. For example, if you want to traverse the tree and perform >> structural operations (say, detect if a tree contains a reference to the >> variable "x"), matching on abstract records is pretty useful: >> >> boolean containsVar(Node node, String name) { >> return switch (node) { >> case VarNode(String s) -> s.equals(name); >> case BinOpNode(var left, var right) -> containsVar(left, >> name) || containsVar(right, name); >> default -> false; >> } >> } >> > > Am I correct that if BinOpNode is an interface there will be a way for it > to specify how it destructures so that it can get this effect also - and > it's just that records are neat because they know how to destructure for > free? > > > Destructuring for free is important, but it's not just destructuring -- > it's all the stuff. It means that the fields and accessors (and therefore, > any methods derivable from that state that is common to all subtypes) get > pulled into the abstract record too. Remember, records can have behavior > that is derived from their state. So if there is any behavior that is > natural on a BinaryOpNode, to put it there, it needs to have its state (or > at least state accessors) there. > Sure, I was assuming you have to put the accessors explicitly on the BinOpNode interface, which is a bit more cumbersome than getting to use record syntax, but only a bit. What else will go wrong? Note: I've been curious what explicit destructuring is expected to look like. We want people to be solid on the fact that two records with all the same > field values are always equals(), and then they may apply that view to an > abstract record type where it doesn't hold true. > > equals() on abstract records is abstract; only the concrete record gets to > declare equals. > *Eppur si muove...* what I mean is, nevertheless equals() is usable and will sometimes return a `false` that may be surprising. Yeah, it's not fundamentally different from the `Collection` problem, and yeah, I do think we can probably live with it; it's just not "free". Pretend we already had non-final. Does that change your inclination? > > I don't think so? The reversed default behavior feels like arbitrary > difference from regular fields (again, I do *want* to encourage finalness > of record fields...). Would we permit the "not final" keyword on interface > fields too? > > > Hadn't thought about that, but, assuming we didn't think that was a bad > idea, yes, we surely could do that. (In other words; interface fields > should be final because we think its dumb for them to be mutable, not > because we don't have a way to spell it.) > Hadn't thought about it either. I decided to think about it just now and, oh, it's a completely *disgusting* idea, because it'd still be implicitly static, mutable statics are The Worst, plus you could modify it from default methods without even realizing it's not instance state. 
So add all this up and we have *three* kind of finalness for fields: - by default mutable, but you can change it - by default final, and you can't change it - (and now) by default final, but you can change it This seems like quite a bad situation to me. > I am usually wary of "let's flip the default on this new thing because we > can" arguments. This seems one of the few places where we could really get > away with it, so I want to consider it seriously. If we think its a bad > idea, I'm OK with ultimately saying "nah, its like classes." But I don't > want to skip over that deliberation, and certainly not for a silly reason > like "but we can't spell it"! > > OK, but do you have an opinion on whether records should automatically > acquire a clone() implementation? > > As much as possible we'll encourage all-final, array-free records that > have no need to be cloned, but some number of records will go against that, > and I guess it's better that they have clone() than that they don't. But my > concern is: What does it do -- deep-clone arrays but shallow-clone > everything else? Sounds problematic no matter which way you decide it. > > Yes, that's the question. One possibility is just to always clone > shallowly; this is not as dumb as it sounds, since the fields are already > exposed for read, and therefore any deep mutability is already flapping in > the wind. > Okay, I guess that's the right move because you kinda want `record.clone().equals(record)`. But a user's assumption that `record.clone()` would deep-clone the array might be the entire reason they're using clone() at all. Oh well, it's not like we'd be making arrays awful for the first time. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From vicente.romero at oracle.com Mon Mar 12 18:45:25 2018 From: vicente.romero at oracle.com (Vicente Romero) Date: Mon, 12 Mar 2018 14:45:25 -0400 Subject: Records: construction and validation In-Reply-To: <9ba9752c-d595-0021-6205-cff020443558@oracle.com> References: <9ba9752c-d595-0021-6205-cff020443558@oracle.com> Message-ID: <7e58695e-a6c2-132f-e5f9-c373282a365a@oracle.com> On 03/12/2018 01:48 PM, Brian Goetz wrote: > Here's a sketch of where our thinking is right now for construction > and validation. > > General goal: As Kevin pointed out, we should make adding incremental > validation easy, otherwise people won't do it, and the result is worse > code.? It should be simple to add validation (and possibly also > normalization) logic to constructors without falling off the syntactic > cliff, either in the declaration or the body of the constructor. > > All records have a /default constructor/.? This is the one whose > signature matches the class signature.? If you don't have an explicit > one, you get an implicit one, regardless of whether or not there are > other constructors. > > If you have records: > > ??? abstract record A(int a) { } > ??? record B(int a, int b) extends A(a) { } > > then the behavior of the default constructor for B is: > > ??? super(a); > ??? this.b = b; > > If you want to provide an explicit constructor to ensure, for example, > that b > 0, you could just say it yourself: > > ??? public B(int a, int b) { > ??????? if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??????? super(a); > ??????? this.b = b; > ??? } > > Wait, wait a second...? I thought we couldn't put statements ahead of > the super-call? > > DIGRESSION... 
> > Historically, this() or super() must be first in a constructor. This > restriction was never popular, and perceived as arbitrary. There were > a number of subtle reasons, including the verification of > invokespecial, that contributed to this restriction.? Over the years, > we've addressed these at the VM level, to the point where it becomes > practical to consider lifting this restriction, not just for records, > but for all constructors. > > Currently a constructor follows a production like: > > ??? [ explicit-ctor-invocation ] statement* > > We can extend this to be: > > ??? statement* [ explicit-ctor-invocation statement* ] > > and treat `this` as DU in the statements in the first block. > > ...END DIGRESSION > > OK, so we can put a statement ahead of the super-call.? But this > explicit declaration is awfully verbose.? We can trim this by: > ?- Allow the compiler to infer the signature for the default > constructor, if none is provided; > ?- Provide a shorthand for "just do the default initialization". > > Now we get: > > ??? public B { > ??????? if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??????? default.this(a, b); > ??? } > > There's still some repetition here; it would be nice if the default > initialization were inferred as well.? Which leads to a question: if > we have a record constructor with no explicit constructor call, do we > do the default initialization at the beginning or the end?? In other > words, does this: > > ??? public B { > ??????? if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??? } > > mean > > ??? public B { > ??????? if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??????? default.this(a, b); > ??? } > > or this: > > public B { > default.this(a, b); > if (b <= 0) > ??????????? throw new IllegalArgumentException("b"); > ??? } > > The two are subtly different, and the difference becomes apparent if > we want to normalize arguments or make defensive copies, not just > validate: > > public B { > ??????? if (b <= 0) > ??????????? b = 0; > ??? } > > If we put our implicit construction at the beginning, this would be a > dead assignment to the parameter, after the record was initialized, > which is almost certainly not what the user meant. If we put it at the > end, this would pick up the update.? The former seems pretty > error-prone, so the latter seems attractive. > > However, this runs into another issue, which is: what if we have > additional fields?? (We might disallow this, but we might not.) Now > what if we wanted to do: > > ??? record B(int a, int b) { > ??????? int cachedSum; > > ??????? B { > ??????????? cachedSum = a + b; > ??????? } > ??? } > > If we treat the explicit statements as occuring before the default > initialization, now `this` is DU at the point of assigning > `cachedSum`, and the compiler tells us that we can't do this.? Of > course, there's a workaround: > > B { > default.this(a, b); > cachedSum = a + b; > ??????? } > > which might be good enough. (Note that we'd like to be able to extend > this ability to constructors of classes other than records eventually, > so we should work out the construction protocol in generality even if > we're not going to do it all now.) > > Is `default.this(a, b)` still too verbose/general/error-prone? Would > some more invariant marker ("do the default thing now") be better, like: > > ??? B { > ??????? new; > this.cachedSum = a + b; > ??? } > > > > So, summarizing: > ?- We're OK with Foo { ... } as shorthand for the default constructor? 
> ?- Where should the implicit construction go -- beginning or end? I think that placing it at the beginning covers more useful cases: validation, caching, etc > ?- Should there be a better idiom other than default.this(args) for > "do the explicit construction now"? bikeshed: what about default(args)? > > Thanks, Vicente From forax at univ-mlv.fr Tue Mar 20 12:20:36 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 20 Mar 2018 13:20:36 +0100 (CET) Subject: Records -- current status In-Reply-To: <33f58fe0-6863-0651-33f6-1db0dffa8838@oracle.com> References: <1214486812.2461440.1521233153357.JavaMail.zimbra@u-pem.fr> <33f58fe0-6863-0651-33f6-1db0dffa8838@oracle.com> Message-ID: <650476135.931196.1521548436585.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "amber-spec-experts" > Envoy?: Vendredi 16 Mars 2018 21:53:16 > Objet: Re: Records -- current status > On 3/16/2018 4:45 PM, Remi Forax wrote: >>> - Mutability and accessibility.? I'd like to propose an odd choice >>> here, which is: fields are final and package (protected for abstract >>> records) by default, but finality can be explicitly opted out of >>> (non-final) and accessibility can be explicitly widened (public). >> I think that record fields should be private thus only visible by the nestmates >> by default. >> I will have agree about being package visible by default before the introduction >> of nestmates, but now that we have nestmates, i do not see the need for having >> an unrelated class in the same package to see the implementation of a record. > The motivation for "package" is that this is the default in classes, so it would > be one less thing that is different about records. Minimizing the differences > between records and classes facilitates refactoring back and forth. (Both > refactoring directions are valuable; existing classes can be refactored to > records to squeeze away the low-value code, but at some point, a record may > cross the boundary of what records can do, and have to be refactored to a class > (just as enums sometimes hit their limits and have to be refactored away.) > Minimizing the skew here helps. Ok ! >> Having the field protected for abstract record is coherent with the fact that >> abstract record show too much detail of implementation and should not exist. > On further thought, these can be package or private too, since the subclass can > call the getter. yes ! >> - Core methods. Records will get equals, hashCode, and toString. >>> There's a good argument for making equals/hashCode final (so they can't >>> be explicitly redeclared); this gives us stronger preservation of the >>> data invariants that allow us to safely and mechanically snapshot / >>> serialize / marshal (we'd definitely want this if we ever allowed >>> additional instance fields.)? No reason to suppress override of >>> toString, though. Records could be safely made cloneable() with >>> automatic support too (like arrays), but not clear if this is worth it >>> (its darn useful for arrays, though.)? I think the auto-generated >>> getters should be final too; this leaves arrays as second-class >>> components, but I am not sure that bothers me. >> i am lost here, if the record is final thus all the methods are final, no ?? > So, if there is an implicit member implementation (like equals()), then an > explicit implementation is like an override (even though its in the same > class). 
So in this context, final means not only "subclass may not override", > but also means "record itself cannot provide explicit implementation." Providing an explicit implementation is easier to understand. I agree that equals and hashCode should not be user-defined, if you want user-defined equals/hashCode, it means you do not want a record. R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Tue Mar 20 12:26:12 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 20 Mar 2018 13:26:12 +0100 (CET) Subject: Records -- current status In-Reply-To: References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> Message-ID: <1493690698.932683.1521548772827.JavaMail.zimbra@u-pem.fr> > De: "Kevin Bourrillion" > ?: "Brian Goetz" > Cc: "amber-spec-experts" > Envoy?: Vendredi 16 Mars 2018 23:36:53 > Objet: Re: Records -- current status [...] >>> OK, but do you have an opinion on whether records should automatically acquire a >>> clone() implementation? >>> As much as possible we'll encourage all-final, array-free records that have no >>> need to be cloned, but some number of records will go against that, and I guess >>> it's better that they have clone() than that they don't. But my concern is: >>> What does it do -- deep-clone arrays but shallow-clone everything else? Sounds >>> problematic no matter which way you decide it. >> Yes, that's the question. One possibility is just to always clone shallowly; >> this is not as dumb as it sounds, since the fields are already exposed for >> read, and therefore any deep mutability is already flapping in the wind. > Okay, I guess that's the right move because you kinda want > `record.clone().equals(record)`. But a user's assumption that `record.clone()` > would deep-clone the array might be the entire reason they're using clone() at > all. Oh well, it's not like we'd be making arrays awful for the first time. There is no need to have a clone because clone == constructor(de-constructor()), And this is true for the serialization too, once you have a de-constructor and a constructor you can serialize the data in the middle, as you can do a shallow copy or a deep copy in the middle. R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Mar 20 13:33:34 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 20 Mar 2018 09:33:34 -0400 Subject: Records -- current status In-Reply-To: <1493690698.932683.1521548772827.JavaMail.zimbra@u-pem.fr> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <1493690698.932683.1521548772827.JavaMail.zimbra@u-pem.fr> Message-ID: You can certainly use this to clone (that's what the clone() impl would do), but its not as convenient to do it manually. Using getters: ??? new Address(addr.first(), addr.last(), addr.street(), addr.city(), addr.postCode()) is more cumbersome than "addr.clone()".? Similarly: ??? let Addr(var f, var l, var s, var c, var p) = addr; ??? new Addr(f, l, s, c, p); is worse.? Both are repetitive, harder to read, and error-prone. The motivation for unleashing a parallel construction deconstruction (and jiggering the rules so that composing them is an identity) is what enables safe mechanical serialization of all forms (not just java.io serialization.)? But just because the tools are now in the users hands, doesn't mean we shouldn't give them help. I find clone very useful for arrays (covariant returns made clone() a lot more useful.)? 
I could imagine it being equally useful for records.

On 3/20/2018 8:26 AM, Remi Forax wrote:
>
> There is no need to have a clone because clone ==
> constructor(de-constructor()),
> And this is true for the serialization too, once you have a
> de-constructor and a constructor you can serialize the data in the
> middle, as you can do a shallow copy or a deep copy in the middle.
>

From forax at univ-mlv.fr  Tue Mar 20 13:38:30 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 20 Mar 2018 14:38:30 +0100 (CET)
Subject: How to implement a record de-constructor ?
Message-ID: <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr>

While a de-constructor is not strictly needed until we want to serialize a record or do pattern matching on it, I think it is useful to share what I have in mind for implementing a de-constructor.

Technically, a de-constructor is the inverse operation of a constructor (hence the name): instead of being a function (int, int) -> Point, it is a function Point -> (int, int). In Java, it would look something like this:

    record Point(int x, int y) {
        public (int, int) () {
            return (x, y);
        }
    }

The issue is that there are no tuples in Java, so writing a de-constructor today is not easy. Valhalla is currently adding value types, and a tuple is a kind of anonymous value type, so Java will have tuples at some point in the future; but delaying the introduction of records until we have tuples is not something we should do.

So we have two ways to deal with the implementation of a de-constructor:

1/ Internally, the toString/equals/hashCode implementations of a record already use a form of de-constructor: an array of method handles, one per field, so a de-constructor is seen as an array of getters. We can add a method in the reflection API, for example on java.lang.Class, that returns this array of method handles. In terms of implementation, one can use a constant dynamic to create an immutable list of method handles; we just need a way to reference that constant pool constant, for example by storing the constant pool index in a class attribute.

2/ We can write the de-constructor in a way that lets the calling code choose which kind of object it wants to get back, like this:

    record Point(int x, int y) {
        public Object (MethodHandle mh, Object o) {
            return mh.invokeExact(o, x, y);
        }
    }

java.lang.Object being the root of all types in Java, value types or not, this abstracts over the construction of a value type in the future. Passing a method handle also has the advantage that, instead of creating a tuple, it can be used, for example, to write the values into a ByteBuffer (or any output stream), but for that we need to pass the ByteBuffer as a parameter, hence the parameter o. This design also has the advantage of being compatible with Scala's extractors, or any API that boxes the field values into one or more objects.

regards,
Rémi

From brian.goetz at oracle.com  Tue Mar 20 13:43:33 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 20 Mar 2018 09:43:33 -0400
Subject: Records -- current status
In-Reply-To: 
References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com>
Message-ID: <632b3cfe-ae45-ef47-ab67-b02f8d31ae3f@oracle.com>

> Note: I've been curious what explicit destructuring is expected to
> look like.

As have we all :)

The underlying model for how destructuring works was discussed at my JVMLS talk last year (https://www.youtube.com/watch?v=n3_8YcYKScw) but it didn't explore the syntax of how you would declare a matcher either.
We're currently struggling with finding a way to express this that (a) supports all the desired degrees of freedom, (b) is not bizarre, and (c) can be mechanically translated to something efficient.? I have some ideas but as this is a feature that's much farther down the road (first we need basic pattern matching, then we need destructuring matching on records, before hand-written matchers are a requirement), I'd rather not distract the conversation with a syntax-oriented discussion.? I am working on a more concrete list of requirements for what explicit matchers need to support, but I've been sitting on that because I want to get the basics on a more solid footing first. From brian.goetz at oracle.com Tue Mar 20 13:49:56 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 20 Mar 2018 09:49:56 -0400 Subject: How to implement a record de-constructor ? In-Reply-To: <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> References: <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> Message-ID: <4d48a282-8575-52da-9616-6b08a311d361@oracle.com> > The issue is that there is no tuple in Java, so writing a de-constructor now is not easy. Nor performant, because even though will soon have something like tuples (that's records), which makes it a lot easier to write, you still have to contend with boxing (until we have value records.) > So we have two ways to deal with the implementation of a de-constructor: > 1/ internally, toString/equals or hashCode implementation of a record are already using a form of de-constructor as an array of method handles You'll recall this was the subject of my JVMLS talk last year. But having a reasonable mechanics to translate to doesn't immediately give us a clue about how to write it in the language; surely we're not exposing method handles to the user.? And there are a host of complicating factors; composition (delegating from one matcher to another, such as the super matcher), partiality vs totality (some patterns always match on a certain type, which the compiler can reason about; others have dynamic conditions like "palindromic string"), laziness (ideally, matching Foo(var x, _) where the second component is ignored shouldn't cause the second component to be computed), eagerness (sometimes you want a matcher to eagerly compute a snapshot, for purposes of data consistency), shared computations (sometimes there are expensive shared computations between the "does it match" calculation and the "extract the components" calculation, which you'd like not to redo, but this conflicts with laziness), inheritance of matchers, overriding of matchers, overloading of matchers, overload selection, and more.? It's a complex topic.? I propose we wait on deep-diving on this until we've gotten some more of the basics onto more solid footing. From brian.goetz at oracle.com Tue Mar 20 14:15:03 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 20 Mar 2018 10:15:03 -0400 Subject: Records -- current status In-Reply-To: References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> Message-ID: > > So add all this up and we have /three/?kind of finalness for fields: > > - by default mutable, but you can change it > - by default final, and you can't change it > - (and now) by default final, but you can change it > > This seems like quite a bad situation to me. > I think what you are really saying here is: if you want immutable records, wait for value records, don't try to cram them in early? 
Then a record inherits the finality of the class kind that it is describing. And same with field accessibility.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guy.steele at oracle.com  Tue Mar 20 15:29:23 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 20 Mar 2018 11:29:23 -0400
Subject: Records -- current status
In-Reply-To: 
References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com>
Message-ID: <5A8E6994-F4E5-4636-BDB4-7F50E88EF0B5@oracle.com>

> On Mar 20, 2018, at 10:15 AM, Brian Goetz wrote:
>
>> So add all this up and we have three kinds of finalness for fields:
>>
>> - by default mutable, but you can change it
>> - by default final, and you can't change it
>> - (and now) by default final, but you can change it
>>
>> This seems like quite a bad situation to me.
>>
>
> I think what you are really saying here is: if you want immutable records, wait for value records, don't try to cram them in early? Then a record inherits the finality of the class kind that it is describing. And same with field accessibility.

On its face, that sounds right to me.

I wish "value" could be the default for records, just as I wish "final" had been the default all along for all fields and local variables. They're the same issue. But that's not our history, so we have to live with it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From forax at univ-mlv.fr  Tue Mar 20 16:10:02 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 20 Mar 2018 17:10:02 +0100 (CET)
Subject: Records -- current status
In-Reply-To: <5A8E6994-F4E5-4636-BDB4-7F50E88EF0B5@oracle.com>
References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <5A8E6994-F4E5-4636-BDB4-7F50E88EF0B5@oracle.com>
Message-ID: <1097813247.1217238.1521562202980.JavaMail.zimbra@u-pem.fr>

> De: "Guy Steele"
> À: "Brian Goetz"
> Cc: "amber-spec-experts"
> Envoyé: Mardi 20 Mars 2018 16:29:23
> Objet: Re: Records -- current status

>> On Mar 20, 2018, at 10:15 AM, Brian Goetz < [ mailto:brian.goetz at oracle.com | brian.goetz at oracle.com ] > wrote:
>>> So add all this up and we have three kinds of finalness for fields:
>>> - by default mutable, but you can change it
>>> - by default final, and you can't change it
>>> - (and now) by default final, but you can change it
>>> This seems like quite a bad situation to me.
>> I think what you are really saying here is: if you want immutable records, wait
>> for value records, don't try to cram them in early? Then a record inherits the
>> finality of the class kind that it is describing. And same with field
>> accessibility.

> On its face, that sounds right to me.
> I wish "value" could be the default for records, just as I wish "final" had been
> the default all along for all fields and local variables. They're the same
> issue. But that's not our history, so we have to live with it.

I'm afraid that users will select value records to get immutability, only to discover afterwards that they have lost the identity property: == between a value type and anything else returns false.

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From brian.goetz at oracle.com Tue Mar 20 16:34:00 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 20 Mar 2018 12:34:00 -0400 Subject: Records -- current status In-Reply-To: <1097813247.1217238.1521562202980.JavaMail.zimbra@u-pem.fr> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <5A8E6994-F4E5-4636-BDB4-7F50E88EF0B5@oracle.com> <1097813247.1217238.1521562202980.JavaMail.zimbra@u-pem.fr> Message-ID: <9f5016ec-d833-3424-1336-cf4f78cfdd36@oracle.com> This will be an education problem with value types in general, records or not.? In fact, in the early days of value types, we observed that, because value types can have a more sensible default equals/hashCode/constructor/etc, we were worried that people might select value types for syntactic reasons.? Hence, in part, splitting records off and doing them first, and making it clear that the record axis is orthogonal to the value axis.? Of course, people may be confused about these new concepts anyway. On 3/20/2018 12:10 PM, Remi Forax wrote: > > I'me afraid that users will select value records to have immutability > to after discover that they have lost the identity property. > == between a value type and something else return false. > From forax at univ-mlv.fr Tue Mar 20 16:59:31 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 20 Mar 2018 17:59:31 +0100 (CET) Subject: How to implement a record de-constructor ? In-Reply-To: <4d48a282-8575-52da-9616-6b08a311d361@oracle.com> References: <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> <4d48a282-8575-52da-9616-6b08a311d361@oracle.com> Message-ID: <756643278.1237921.1521565171309.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" , "amber-spec-experts" > Envoy?: Mardi 20 Mars 2018 14:49:56 > Objet: Re: How to implement a record de-constructor ? >> The issue is that there is no tuple in Java, so writing a de-constructor now is >> not easy. > Nor performant, because even though will soon have something like tuples > (that's records), which makes it a lot easier to write, you still have > to contend with boxing (until we have value records.) > >> So we have two ways to deal with the implementation of a de-constructor: >> 1/ internally, toString/equals or hashCode implementation of a record are >> already using a form of de-constructor as an array of method handles > > You'll recall this was the subject of my JVMLS talk last year. i know, i had to change my talk because we had the same subject ... > > But having a reasonable mechanics to translate to doesn't immediately > give us a clue about how to write it in the language; surely we're not > exposing method handles to the user.? yes ! > And there are a host of > complicating factors; composition (delegating from one matcher to > another, such as the super matcher), It's another reason to not have abstract record, sorry, kidding. Allowing an explicit call to a de-constructor is more complex, again because we do not have tuples, but we do not have to support it because for records there is always the workaround to use the getter instead of delegating to another de-constructor. 
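To make that workaround concrete, here is a rough sketch in plain Java of what it could look like today, with the de-constructor written as an ordinary method in the option (2) style (a carrier method handle plus an opaque target). The class names and the deconstruct() shape are only placeholders for illustration, not a proposed API:

    import java.lang.invoke.MethodHandle;

    // abstract base standing in for the "super matcher" case
    abstract class AbstractPoint {
        private final int x;
        AbstractPoint(int x) { this.x = x; }
        public int x() { return x; }
    }

    final class Point3D extends AbstractPoint {
        private final int y, z;
        Point3D(int x, int y, int z) { super(x); this.y = y; this.z = z; }
        public int y() { return y; }
        public int z() { return z; }

        // instead of delegating to a de-constructor of AbstractPoint and splicing
        // its components together with y and z, read the inherited component
        // through its getter and hand everything to the caller-supplied carrier
        public Object deconstruct(MethodHandle carrier, Object target) throws Throwable {
            return carrier.invoke(target, x(), y, z);
        }
    }

The price is that the subclass has to spell out the full list of components, but for a record that list is the state description anyway.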
> partiality vs totality (some patterns always match on a certain type, which the compiler can reason > about; others have dynamic conditions like "palindromic string"), > laziness (ideally, matching Foo(var x, _) where the second component is > ignored shouldn't cause the second component to be computed), We are not talking about de-constructor here, right, we have moved to how to specify a matcher, which is not exactly the same issue (despite Scala merging the two concepts). if a de-constructor is inlined by the VM the same way getter is inlined by the VM, i.e. most of the time, and if the de-constructor has no side effect (otherwise you have bigger problem that just a performance issue) then the VM will remove unused computation, share common sub-expressions, etc. > eagerness (sometimes you want a matcher to eagerly compute a snapshot, for > purposes of data consistency), shared computations (sometimes there are > expensive shared computations between the "does it match" calculation > and the "extract the components" calculation, which you'd like not to > redo, but this conflicts with laziness), inheritance of matchers, > overriding of matchers, overloading of matchers, overload selection, and > more.? It's a complex topic.? I propose we wait on deep-diving on this > until we've gotten some more of the basics onto more solid footing. i think it's useful to separate the discussion about the de-constructor from the discussion about the matcher. R?mi From brian.goetz at oracle.com Tue Mar 20 17:06:39 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 20 Mar 2018 13:06:39 -0400 Subject: How to implement a record de-constructor ? In-Reply-To: <756643278.1237921.1521565171309.JavaMail.zimbra@u-pem.fr> References: <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> <4d48a282-8575-52da-9616-6b08a311d361@oracle.com> <756643278.1237921.1521565171309.JavaMail.zimbra@u-pem.fr> Message-ID: Your point is noted that the notion of a record deconstructor is a simpler case of general matchers.? However, is there an immediate need for explicit record deconstructors, since the compiler can generate the obvious one?? And if there is an immediate need, wouldn't it likely be motivated by the need for some of the more advanced features (partiality, snapshotting, etc)? > if a de-constructor is inlined by the VM the same way getter is inlined by the VM, i.e. most of the time, > and if the de-constructor has no side effect (otherwise you have bigger problem that just a performance issue) then the VM will remove unused computation, > share common sub-expressions, etc. I would love to buy this, but I don't.? The expensive actions of dtors are likely to be allocating and copying stuff; even if the results are unused, there are still many reasons why the JIT might not elide the allocation and copying. From forax at univ-mlv.fr Tue Mar 20 17:16:26 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 20 Mar 2018 18:16:26 +0100 (CET) Subject: How to implement a record de-constructor ? In-Reply-To: References: <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> <4d48a282-8575-52da-9616-6b08a311d361@oracle.com> <756643278.1237921.1521565171309.JavaMail.zimbra@u-pem.fr> Message-ID: <2023568612.1246094.1521566186205.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "amber-spec-experts" > Envoy?: Mardi 20 Mars 2018 18:06:39 > Objet: Re: How to implement a record de-constructor ? 
> Your point is noted that the notion of a record deconstructor is a > simpler case of general matchers.? However, is there an immediate need > for explicit record deconstructors, since the compiler can generate the > obvious one?? And if there is an immediate need, wouldn't it likely be > motivated by the need for some of the more advanced features > (partiality, snapshotting, etc)? > >> if a de-constructor is inlined by the VM the same way getter is inlined by the >> VM, i.e. most of the time, >> and if the de-constructor has no side effect (otherwise you have bigger problem >> that just a performance issue) then the VM will remove unused computation, >> share common sub-expressions, etc. > > I would love to buy this, but I don't.? The expensive actions of dtors > are likely to be allocating and copying stuff; even if the results are > unused, there are still many reasons why the JIT might not elide the > allocation and copying. Apart if people still write side effects in their constructor in 2018, it should not be an issue, we are still missing frozen arrays but immutable collections (List.of+List.copyOf) is a good enough replacement. R?mi From Daniel_Heidinga at ca.ibm.com Tue Mar 20 20:31:00 2018 From: Daniel_Heidinga at ca.ibm.com (Daniel Heidinga) Date: Tue, 20 Mar 2018 20:31:00 +0000 Subject: How to implement a record de-constructor ? In-Reply-To: <2023568612.1246094.1521566186205.JavaMail.zimbra@u-pem.fr> References: <2023568612.1246094.1521566186205.JavaMail.zimbra@u-pem.fr>, <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> <4d48a282-8575-52da-9616-6b08a311d361@oracle.com> <756643278.1237921.1521565171309.JavaMail.zimbra@u-pem.fr> Message-ID: >> >>> if a de-constructor is inlined by the VM the same way getter is >inlined by the >>> VM, i.e. most of the time, >>> and if the de-constructor has no side effect (otherwise you have >bigger problem >>> that just a performance issue) then the VM will remove unused >computation, >>> share common sub-expressions, etc. Even ignoring side-effects, the cost model for a getter method is vastly different than invoking a MethodHandle from a List / array of MethodHandles, even if that List/array is rooted in the constant pool. The getter is an invokevirtual of cp data that the VM *knows* can't change. The MH is fetched from a mutable object (at least as far as the VM can trust) and then invoked. There's a lot of simulation overhead there - think emulating C++ vtables in C. >> >> I would love to buy this, but I don't. The expensive actions of >dtors >> are likely to be allocating and copying stuff; even if the results >are >> unused, there are still many reasons why the JIT might not elide >the >> allocation and copying. > >Apart if people still write side effects in their constructor in >2018, it should not be an issue, >we are still missing frozen arrays but immutable collections >(List.of+List.copyOf) is a good enough replacement. A frozen array would help as the VM could be taught the elements won't change. A List, even an immutable one, looks mutable to the VM. --Dan From maurizio.cimadamore at oracle.com Tue Mar 20 21:13:02 2018 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 20 Mar 2018 21:13:02 +0000 Subject: How to implement a record de-constructor ? 
In-Reply-To: <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> References: <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> Message-ID: Just a digression on your premise - it's true that records hide a bit of the need for explicit deconstruction; but if you are a fan of the 'you don't need abstract records' camp (as I am :-)), I believe that means that you have to make up for the lack of abstract records with some kind of custom extractors - so there's some kind of tension between these two types of complexities (extension mechanism vs. explicit destructuring) here. Maurizio On 20/03/18 13:38, Remi Forax wrote: > While a de-constructor is not strictly needed until we want to serialize a record or do Pattern matching on it, > i think it can be useful to share what i think about implementing a de-constructor. From forax at univ-mlv.fr Wed Mar 21 08:06:38 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 21 Mar 2018 09:06:38 +0100 (CET) Subject: How to implement a record de-constructor ? In-Reply-To: References: <2023568612.1246094.1521566186205.JavaMail.zimbra@u-pem.fr> <1622589870.1012355.1521553110063.JavaMail.zimbra@u-pem.fr> <4d48a282-8575-52da-9616-6b08a311d361@oracle.com> <756643278.1237921.1521565171309.JavaMail.zimbra@u-pem.fr> Message-ID: <1791872002.1381662.1521619598482.JavaMail.zimbra@u-pem.fr> Hi Dan, ----- Mail original ----- > De: "Daniel Heidinga" > ?: "Remi Forax" > Cc: "Brian Goetz" , "amber-spec-experts" > Envoy?: Mardi 20 Mars 2018 21:31:00 > Objet: Re: How to implement a record de-constructor ? >>> >>>> if a de-constructor is inlined by the VM the same way getter is >>inlined by the >>>> VM, i.e. most of the time, >>>> and if the de-constructor has no side effect (otherwise you have >>bigger problem >>>> that just a performance issue) then the VM will remove unused >>computation, >>>> share common sub-expressions, etc. > > Even ignoring side-effects, the cost model for a getter method is > vastly different than invoking a MethodHandle from a List / array > of MethodHandles, even if that List/array is rooted in the constant > pool. > > The getter is an invokevirtual of cp data that the VM *knows* can't > change. The MH is fetched from a mutable object (at least as far as > the VM can trust) and then invoked. > > There's a lot of simulation overhead there - think emulating C++ > vtables in C. > I think there is a misunderstanding here, i was talking about the option (2), where the de-constructor is record Point { Object (MethodHandle mh, Object o) { return mh.invokeExact(o, this.x, this.y); } } in that case as you see the the VM can do a pattern matching on the bytecode to see if it's a dumb extractor or not, like it does for a dumb getter. For option (1), where you get an array/list of method handles, as you said, things are more complicated. You can still for most operations like serialization, cloning, etc, help the VM by creating create one methodh handle from the list by folding them together. By example, for writing a record into an output stream, you can start from the list of getters [point -> point.x, point -> point.y] and fold them to one method handle that will write every field values on the output stream (outputStream, point) -> { outputStream.write(point.x); outputStream.write(point.y); } >>> >>> I would love to buy this, but I don't. 
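As a concrete sketch with the existing MethodHandles combinators: Point here is a plain stand-in class with x()/y() accessors (records do not exist yet), and the getters are looked up by hand instead of coming from a runtime-provided list, so this only illustrates the shape of the folding, not the real mechanism:

    import java.io.DataOutputStream;
    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import static java.lang.invoke.MethodType.methodType;

    final class Point {
        private final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        public int x() { return x; }
        public int y() { return y; }
    }

    class PointWriter {
        // builds a single (DataOutputStream, Point)void handle that writes x then y
        static MethodHandle writer() throws ReflectiveOperationException {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle getX = lookup.findVirtual(Point.class, "x", methodType(int.class));
            MethodHandle getY = lookup.findVirtual(Point.class, "y", methodType(int.class));
            // (DataOutputStream, int)void
            MethodHandle writeInt = lookup.findVirtual(DataOutputStream.class, "writeInt",
                                                       methodType(void.class, int.class));
            // adapt each getter into a (DataOutputStream, Point)void handle that writes one component
            MethodHandle writeX = MethodHandles.filterArguments(writeInt, 1, getX);
            MethodHandle writeY = MethodHandles.filterArguments(writeInt, 1, getY);
            // run writeX first, then writeY, on the same (DataOutputStream, Point) arguments
            return MethodHandles.foldArguments(writeY, writeX);
        }
    }

Invoking the resulting handle with a DataOutputStream and a Point writes both components in order, and if it is stashed in a static final field the whole chain is a constant the JIT can see through.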
The expensive actions of >>dtors >>> are likely to be allocating and copying stuff; even if the results >>are >>> unused, there are still many reasons why the JIT might not elide >>the >>> allocation and copying. >> >>Apart if people still write side effects in their constructor in >>2018, it should not be an issue, >>we are still missing frozen arrays but immutable collections >>(List.of+List.copyOf) is a good enough replacement. > > A frozen array would help as the VM could be taught the elements > won't change. A List, even an immutable one, looks mutable to > the VM. Yes, an immutable list like the result of List.of() doesn't help the VM but it avoid the user code to contains a defensive copy which helps the VM. > > --Dan R?mi From james.laskey at oracle.com Wed Mar 21 15:10:09 2018 From: james.laskey at oracle.com (Jim Laskey) Date: Wed, 21 Mar 2018 12:10:09 -0300 Subject: Raw String Literals Message-ID: <10B03F59-E391-4F94-B4FE-FD76537D72E8@oracle.com> We think things have "settled in" with respect to Raw String Literals language changes and library support. If all things fall in place, we will probably move http://openjdk.java.net/jeps/326 forward soon. We are hoping to make builds available, but if you want to test the waters. hg clone http://hg.openjdk.java.net/amber/amber amber-raw-string-literals cd amber-raw-string-literals hg update raw-string-literal ./configure ? make images docs export PLATFORM= ... java home is at build/${PLATFORM}/images/jdk javadoc for String library support is at build/${PLATFORM}/images/docs/api/java.base/java/lang/String.html Cheers, ? Jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 23 18:03:20 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 23 Mar 2018 14:03:20 -0400 Subject: Mutable records In-Reply-To: <4c05fc63-2ba5-778e-be91-dee735d03861@oracle.com> References: <4c05fc63-2ba5-778e-be91-dee735d03861@oracle.com> Message-ID: <96bbb47c-eeff-2cb8-4c59-ee0792c39c0a@oracle.com> A few people have asked, "wouldn't it just be easier to prohibit mutability in records"?? And while it surely would be easier (most of the issues I raised in my writeup go away without mutability), I think it would also greatly restrict the utility of the feature.? Let me talk about why, and give some examples -- and then I'd like to talk about what we can do, if anything, to make the mutable use cases easier. ## General argument: Mutability is pervasive in Java; you can only push it away a bit. We saw this with lambdas; developers are all too eager to "work around" the limitation on mutable local capture by wrapping their mutables in a one-element array.? In fact, IDEs even "helpfully" offers to do this for you, thus ensuring that everyone thinks this is OK. We will see this again with value types; even though value types are immutable, value types can contain references to mutable objects, and trying to enforce "values all the way down" would result in fairly useless value types. (That doesn't mean we can't nudge towards immutability where we think it makes sense, if we think the value of the nudge exceeds the irregularity or complexity it entails.) ## Records and value types: goals and similarities While records and value types have some features in common (getting equals() for free), they have different motivations. 
Value types are about treating aggregates as, well, values, with all the things that entails; they can be freely shared, the runtime can routinely optimize them by putting them on the stack or in registers and flatten them into enclosing values, classes, or arrays (yielding better density and flatness.)? What they ask you to give up in exchange is identity, which means giving up mutability and layout polymorphism. Records are about treating data as data; when modeling aggregates with records, the result is transparent classes whose API and representation are the same thing.? This means that records can be freely interconverted between their exploded and aggregate forms with no loss of information.? What they ask you to give up is the freedom to define the mapping between representation and API (constructors, accessors, equals, hashCode, deconstruction) in a nontransparent way.? (Essentially, you give up all encapsulation except for the ability to control writes to their state.) My claim is that the goals are mostly orthogonal, and the benefits and tradeoffs of each are as well.? All four quadrants make sense to me.? Some aggregates are values but not transparent (think cursors that hold references into the internals of a data structure, or hold a native resource); some are "just their data" but not values (graph nodes, as well as the mutable examples below), and others are both (value records). The superficial commonalities between records and values (both are restricted forms of aggregate, and these restrictions make it possible to provide sensible defaults for things like equals) tease us into thinking they are the same thing, but I don't think they are. Assuming this to be true, how can we justify having two new constructs?? Value types, by nature of what they require the developer to give up, enable the runtime to make significant optimizations it could not otherwise make.? So if we want flat and dense data, this is basically our only option -- make the programmer consent to the handcuffs.? The argument for records is more of a contingent one; records allow you to express more with less.? The "more with less" has at least two aspects; in addition to the obvious reduction in boilerplate, libraries and frameworks can make more reasonable assumptions about what construction or deconstruction means, and therefore can build useful functionality safely (such as marshaling to/from XML.)? But records don't let you do anything you can't already do with classes.? So if I had a quota, I'd have to pick values over records. In a language with values on the roadmap, immutable-only records seem to offer a pretty lame return-on-complexity.? Nothing about values requires you to use encapsulation, so you could model most immutable records with a value type, with less boilerplate than a class (but more than none), and the remainder with classes. (Immutable records buy you one thing that values do not -- pointer polymorphism.? That lets you make graphs or trees of them.)? But I think it is clear that this model of records is a kind of weird half-one, half-the-other thing, and its not entirely clear it would carry its weight. And, when users ask "why can't record components be mutable, after all, records are about data, and some data is mutable", I don't think we have a very good answer other than "immutability is good for you."? I much prefer the argument of "there are two orthogonal sets of tradeoffs; pick one, the other, or both." 
## Use cases for mutable records Here are two use cases that immediately come to mind; please share others. Groups of related mutable state.? An example here is a set of counters.? If I define: ??? record CacheCounters(public int hitCount, public int accessCount) { ??????? float hitRate() { ... } ??? } then I can treat them as a single entity; store a counter-pair in a variable, have arrays of them, use them as values of Maps, pass them around, etc.? (The fact that they're mutable introduces constraints, but not new constraints; we deal with this problem every day.)? I can even lock on it, if that's how I want to do it. Domain objects.? Another common use is domain agregates: ??? record Person(String first, String last); If I want to marshal one of these to or from XML using a framework like JAXB, I can provide mapping metadata between XML schema and classes, and the framework will gladly populate the object for me.? The way these frameworks want to work is to instantiate a mutable object with a no-arg constructor, and then set the fields (or call setters) as components become available on the stream. Yes, you can write a binding framework that waits until it has all the stuff and then calls a N-arg constructor, but that's a lot harder, and uses a lot more memory.? Mutable records will play nicely with these frameworks. ## Embracing mutability I cheated a bit in the two examples I gave; neither had a no-arg constructor.? We could do a few things about this: ?- Make the user write a no-arg constructor (and hopefully make this easy enough) ?- Provide a no-arg constructor for all records that just pass the default values for that type to the default constructor (which might reject them, if it doesn't like nulls) ?- Try to provide a "minimal" constructor that only takes the final fields.? (I don't like this because changing a field between final and not changes the signature of an implicit constructor, which won't be binary compatible.) Similarly, you could object that deriving equals/hashCode from mutable state is dangerous.? (But List does do this.)? Again, there are a few ways to deal.? We could adjust the standard equals/hashCode to only take into account final fields.? But, I'm skeptical of this, because I could easily imagine people constructing records via mutation but then using them in an effectively immutable way thereafter, and they might want the stronger equals contract.? Or, we could tell people, as we do with List, not to use them as keys in hash-based collections.? (We could even have compiler warnings about this.) ## Additional considerations Here are a few less fundamental points about accepting mutable records, none of which are slam-dunks, but might still be useful to consider: ?- People will just work around it anyway, as they do with lambdas.? If a class has N-1 final fields, and one mutable one, what do we think they're going to do? ?- C# embraced mutable records.? This isn't surprising, but what is surprising is that Scala's case classes did also.? While I don't have data from either Neal or Martin, I suspect that they went through a similar analysis -- that it would leave out too many desirable use cases for the feature, and still not protect us from deeper mutability anyway. ?- Mutability introduces pain, but so does repetition and boilerplate -- it gives bugs a place to hide.? Making the feature less applicable consigns more users to using a more error-prone mechanism. ## Fields: final by default? 
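To make the hash-based-collection concern above concrete, here is the existing List behavior we would be mirroring if equals/hashCode are derived from mutable components (a small self-contained example, not record syntax):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class MutableKeyDemo {
        public static void main(String[] args) {
            List<String> key = new ArrayList<>(List.of("a"));
            Map<List<String>, String> map = new HashMap<>();
            map.put(key, "value");

            key.add("b");                      // mutate the key after insertion
            System.out.println(map.get(key));  // null: the key now hashes to a different bucket
            System.out.println(map.containsKey(List.of("a"))); // false: the stored key no longer equals ["a"]
        }
    }

A mutable record used as a map key would misbehave in exactly the same way, which is why documentation (as with List) or a compiler warning seems like the right level of response.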
One of the nudges we've considered is making fields final by default, but letting them be declared non-final.? This is a nudge, in that it sends a message that records are best served immutable, but if you want your revenge warm, you can have it.? I think there are reasonable arguments on both sides of this story, but one argument I am not particularly motivated by is "but then we'd have to introduce non-final as a keyword."? If we think final-by-default is a good idea, I don't think the lack of a denotation should be the impediment. ## Clone Clone is a mess, and I'm not sure there's a good answer here, but there's surely a good discussion. As a user, I find the ability to clone arrays (despite being shallow) is super useful, and it makes it far easier to be good about doing defensive copies all the time.? If cloning were harder (new array/arraycopy), I'd probably cut more corners.? If we can deliver the same benefit for records, that seems enticing. There's a fair argument over whether the standard clone should be shallow (easy to specify and implement) or should try to deeply clone Cloneable components.? Or maybe both options suck.? Or maybe it should be opt in; if the record extends Clonable, you get a clone() method. What did I miss? From brian.goetz at oracle.com Fri Mar 23 18:45:41 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 23 Mar 2018 14:45:41 -0400 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> Message-ID: <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> Just want to sync and make sure we're on the same page here... Certain constructs (switch expression, switch statement, for/while/do) give meaning to some flavor of break.? Inside those, you can't use the other flavor, nor can you break "through" a construct of the opposite flavor. ??? switch-expression { ? ?? ?? break / break-label not allowed; ??????? break-expr allowed; ??????? continue, return not allowed; ??????? if (foo) { ? ?? ?????? break / break-label disallowed; ? ? ? ?? ?? break-expr allowed; ??????? } ? ?? ?? LOOP: ??????? for (...) { ??????????? break, continue allowed; ??????????? return not allowed; ? ?? ?????? break-label allowed if within the switch expression ??? ? ?? ?? break expression not allowed; ??????? } ??????? switch (e) { ??????????? // same as for loop ??????????? switch-expression { ??????????????? break expr allowed ??????????????? break, break-label, continue, return not allowed ??????? } ? ? } More formally; we can construct a table whose rows are the control constructs and whose columns are the nonlocal branches, and whose entries are "handlers" for a nonlocal branch.? Each block has a parent (the immediately enclosing block.) ? ?? ? ?? ? break-e?? break?? break-l?? continue?? return switch-e????? L???????? X?????? X???????? X????????? X switch-s????? X???????? L?????? P???????? L????????? P for?????????? X???????? L?????? P???????? L????????? P while???????? X???????? L?????? P???????? L????????? P block???????? P???????? P?????? P???????? P????????? P labeled?????? X???????? X L*??????? X????????? P lambda??????? X???????? X?????? X???????? X L method??????? X???????? X X???????? X????????? 
L The handlers mean: X -- not allowed P -- let the parent handle it L -- handle it and complete normally L* -- handle it and complete normally if the labels match, otherwise P (I might have mangled the labeled row, in which case surely Guy will correct me.)? The idea here is that each nested block acts as an "exception handler", catching some exceptions, and propagating others, and some contexts act as exception barriers, like trying to throw a "continue" out of a method. On 3/2/2018 9:30 AM, Remi Forax wrote: > Hi all, > as far as i remember, the current idea to differentiate between a break label and a break value is to let the compiler figure this out, > i wonder if it's not simpler to disallow break label (and continue label) inside an expression switch. > > After all, an expression switch do not exist yet, so no backward compatibility issue, it may make some refactoring impossible but had the great advantage to do not allow a lot of puzzler codes like the one below. > > enum Result { > ONE, MANY > } > > Result result(String[] args) { > ONE: for(String s: args) { > return switch(s) { > case "several": > case "many": > break MANY; > case "one": > break ONE; > default: > continue; > }; > } > throw ...; > } > > R?mi From guy.steele at oracle.com Fri Mar 23 18:41:07 2018 From: guy.steele at oracle.com (Guy Steele) Date: Fri, 23 Mar 2018 14:41:07 -0400 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> Message-ID: See comments in two places below. > On Mar 23, 2018, at 2:45 PM, Brian Goetz wrote: > > Just want to sync and make sure we're on the same page here... > > Certain constructs (switch expression, switch statement, for/while/do) give meaning to some flavor of break. Inside those, you can't use the other flavor, nor can you break "through" a construct of the opposite flavor. > > switch-expression { > break / break-label not allowed; > break-expr allowed; > continue, return not allowed; > > if (foo) { > break / break-label disallowed; > break-expr allowed; > } > > LOOP: > for (...) { > break, continue allowed; > return not allowed; > break-label allowed if within the switch expression > break expression not allowed; > } > > switch (e) { > // same as for loop Actually, same as for loop _except_ continue not allowed. > switch-expression { > break expr allowed > break, break-label, continue, return not allowed > } > } > > More formally; we can construct a table whose rows are the control constructs and whose columns are the nonlocal branches, and whose entries are "handlers" for a nonlocal branch. Each block has a parent (the immediately enclosing block.) > > break-e break break-l continue return > > switch-e L X X X X > switch-s X L P L P > for X L P L P > while X L P L P > block P P P P P > labeled X X L* X P > lambda X X X X L > method X X X X L > > The handlers mean: > > X -- not allowed > P -- let the parent handle it > L -- handle it and complete normally > L* -- handle it and complete normally if the labels match, otherwise P > > (I might have mangled the labeled row, in which case surely Guy will correct me.) Right. It should be: labeled P P L* P P ?In all other cases of abrupt completion of the Statement, the labeled statement completes abruptly for the same reason.? JLS SE8 14.7, page 413 All the other rows look good to me. 
> The idea here is that each nested block acts as an "exception handler", catching some exceptions, and propagating others, and some contexts act as exception barriers, like trying to throw a "continue" out of a method. ?Guy -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Fri Mar 23 19:33:58 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 23 Mar 2018 12:33:58 -0700 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> Message-ID: <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> On Mar 23, 2018, at 11:45 AM, Brian Goetz wrote: > > More formally; we can construct a table whose rows are the control constructs and whose columns are the nonlocal branches, and whose entries are "handlers" for a nonlocal branch. Each block has a parent (the immediately enclosing block.) It surprises me that break-e is so much more restricted than return, since I would expect we would aim at a design where an switch-e could be refactored incrementally to a method and vice versa (in cases where the e-switch had no side effects on locals). The symptom of this design choice is the large number of X entries in the break-e column, where there are few X entries in the return column. I suppose you are aiming in this direction to reduce occasional ambiguities between break-e and break-l. But such ambiguities can be controlled in other ways while getting more free passage of break-e to its enclosing switch, through intervening control flow. The break-e column assigns L to switch-e (obviously the root requirement) and X to everything else except block. I would expect the break-e column would assign P to everything except lambda and method, in symmetry with the return column, which assigns P to everything except switch-e (and naturally L to lambda and method). Specifically, this should not be ruled out, IMO: int x = switch (e) ( case 0: for (int y : someInts) { if (y < x) break (y); } return 0; default: return e; }; We have already discussed ways to deal with the ambiguity that could (rarely!!) arise if a name like "y" above also looks like a statement label. Reporting the ambiguity as an error is an easy thing to do, or quietly accepting the label in preference to the expression is also easy. Under either rule, users can use parens (as above) to make clear which kind of break statement they mean. Am I missing some other consideration? ? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 23 19:41:51 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 23 Mar 2018 15:41:51 -0400 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> Message-ID: <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> > The symptom of this design choice is the large number of X > entries in the break-e column, where there are few X entries > in the return column. > > I suppose you are aiming in this direction to reduce occasional > ambiguities between break-e and break-l. 
?But such ambiguities > can be controlled in other ways while getting more free passage > of break-e to its enclosing switch, through intervening control > flow. I am not worried that there will be occasional ambiguities; I'm worried that the code reader will see "break X" in a switch statement and _not be able to know what it means_ without doing a nonlocal analysis.? Given that the requirement for a nested switch statement to break-e out of an enclosing switch expression already seems tenuous, it seemed best to segregate the break forms.? Either break/break-L is allowed in a given context, or break-E is allowed in that context, but not both. > Specifically, this should not be ruled out, IMO: > > int x = switch (e) ( > case 0: > ? for (int y : someInts) { > ? ? if (y < x) ?break (y); > ? } > ? return 0; > default: return e; > }; I would prefer to rule it out.? We don't allow returns out of the middle of a conditional, after all.? I prefer the simplicity that "all expressions either complete normally or complete abruptly with cause exception."? If you really need this kind of control flow, where maybe you yield a value or maybe you return, use a switch statement: ??? int result; ??? switch (e) { ??????? case 0: ??????????? for (int y : someInts) { ??????????????? if (y < x) { result = y; break ; } ??????????? } ?????????? return 0; ?????? default: return e; ??? } ??? // consume result here > We have already discussed ways to deal with the ambiguity that > could (rarely!!) arise if a name like "y" above also looks like a > statement > label. ?Reporting the ambiguity as an error is an easy thing to do, > or quietly accepting the label in preference to the expression is > also easy. ?Under either rule, users can use parens (as above) to > make clear which kind of break statement they mean. > > Am I missing some other consideration? > I think so.? Its not detecting the ambiguity, its that the possibility of mixing control flow kinds means the poor reader has to reason about all the possibilities, and not be sure what "break FOO" means in a switch statement. From john.r.rose at oracle.com Fri Mar 23 19:53:09 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 23 Mar 2018 12:53:09 -0700 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> Message-ID: On Mar 23, 2018, at 12:41 PM, Brian Goetz wrote: > > I think so. Its not detecting the ambiguity, its that the possibility of mixing control flow kinds means the poor reader has to reason about all the possibilities, and not be sure what "break FOO" means in a switch statement. A lighter fix, then, would be to require parentheses always. An even lighter fix would be leave that kind of clarification to style guides. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.goetz at oracle.com Fri Mar 23 19:58:03 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 23 Mar 2018 15:58:03 -0400 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> Message-ID: <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> I think the burden should go the other way -- we should justify why such flexibility is warranted, and I think this is difficult to do. The conditions under which such control flow are useful are already small, and further, when you might want to code like that, the expression switch form offers relatively little benefit over expression statements anyway (since by definition you're using a lot of control-flow statements to make your complex control-flow decision.)? In exchange, the cost on all readers is high, because they can't be sure about what "break X" means.? Why burden all users with this complexity?? Why spend any of our complexity budget on such a low-leverage option? On 3/23/2018 3:53 PM, John Rose wrote: > On Mar 23, 2018, at 12:41 PM, Brian Goetz > wrote: >> >> I think so.? Its not detecting the ambiguity, its that the >> possibility of mixing control flow kinds means the poor reader has to >> reason about all the possibilities, and not be sure what "break FOO" >> means in a switch statement. > > A lighter fix, then, would be to require parentheses always. > An even lighter fix would be leave that kind of clarification to style > guides. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lowasser at google.com Fri Mar 23 20:13:09 2018 From: lowasser at google.com (Louis Wasserman) Date: Fri, 23 Mar 2018 20:13:09 +0000 Subject: Mutable records In-Reply-To: <96bbb47c-eeff-2cb8-4c59-ee0792c39c0a@oracle.com> References: <4c05fc63-2ba5-778e-be91-dee735d03861@oracle.com> <96bbb47c-eeff-2cb8-4c59-ee0792c39c0a@oracle.com> Message-ID: FWIW, some data: Our AutoValue tool at Google has been *extremely* successful and forbids mutation. - Despite only being added relatively recently compared to the age of the codebase, a full 25% of all getters in our codebase are written with AutoValue. - We provide a @Memoized annotation, that memoizes the results of methods, and tooling for generating builders. - We don't forbid mutable field types. I can confidently say that we've never regretted the decision to only support immutable objects. To give some broader context, I analyzed all instance fields in the Google codebase, and came up with some edifying numbers: - Outside of generated code, 71% of all instance fields in our codebase are never assigned to outside of constructors for the class where they are defined, which I'd call "effectively final." - That number goes up one or two percent if you exclude test code and/or exclude fields in classes whose names end with "Builder". - It goes up to 76% if you include AutoValue "fields" (the fields are only declared as normal Java fields in generated code, which we'd excluded). - 67% of all classes with at least one field have all effectively final fields. On Fri, Mar 23, 2018 at 11:03 AM Brian Goetz wrote: > A few people have asked, "wouldn't it just be easier to prohibit > mutability in records"? 
And while it surely would be easier (most of > the issues I raised in my writeup go away without mutability), I think > it would also greatly restrict the utility of the feature. Let me talk > about why, and give some examples -- and then I'd like to talk about > what we can do, if anything, to make the mutable use cases easier. > > ## General argument: Mutability is pervasive in Java; you can only push > it away a bit. > > We saw this with lambdas; developers are all too eager to "work around" > the limitation on mutable local capture by wrapping their mutables in a > one-element array. In fact, IDEs even "helpfully" offers to do this for > you, thus ensuring that everyone thinks this is OK. > > We will see this again with value types; even though value types are > immutable, value types can contain references to mutable objects, and > trying to enforce "values all the way down" would result in fairly > useless value types. > > (That doesn't mean we can't nudge towards immutability where we think it > makes sense, if we think the value of the nudge exceeds the irregularity > or complexity it entails.) > > ## Records and value types: goals and similarities > > While records and value types have some features in common (getting > equals() for free), they have different motivations. > > Value types are about treating aggregates as, well, values, with all the > things that entails; they can be freely shared, the runtime can > routinely optimize them by putting them on the stack or in registers and > flatten them into enclosing values, classes, or arrays (yielding better > density and flatness.) What they ask you to give up in exchange is > identity, which means giving up mutability and layout polymorphism. > > Records are about treating data as data; when modeling aggregates with > records, the result is transparent classes whose API and representation > are the same thing. This means that records can be freely > interconverted between their exploded and aggregate forms with no loss > of information. What they ask you to give up is the freedom to define > the mapping between representation and API (constructors, accessors, > equals, hashCode, deconstruction) in a nontransparent way. > (Essentially, you give up all encapsulation except for the ability to > control writes to their state.) > > My claim is that the goals are mostly orthogonal, and the benefits and > tradeoffs of each are as well. All four quadrants make sense to me. > Some aggregates are values but not transparent (think cursors that hold > references into the internals of a data structure, or hold a native > resource); some are "just their data" but not values (graph nodes, as > well as the mutable examples below), and others are both (value records). > > The superficial commonalities between records and values (both are > restricted forms of aggregate, and these restrictions make it possible > to provide sensible defaults for things like equals) tease us into > thinking they are the same thing, but I don't think they are. > > Assuming this to be true, how can we justify having two new constructs? > Value types, by nature of what they require the developer to give up, > enable the runtime to make significant optimizations it could not > otherwise make. So if we want flat and dense data, this is basically > our only option -- make the programmer consent to the handcuffs. The > argument for records is more of a contingent one; records allow you to > express more with less. 
The "more with less" has at least two aspects; > in addition to the obvious reduction in boilerplate, libraries and > frameworks can make more reasonable assumptions about what construction > or deconstruction means, and therefore can build useful functionality > safely (such as marshaling to/from XML.) But records don't let you do > anything you can't already do with classes. So if I had a quota, I'd > have to pick values over records. > > In a language with values on the roadmap, immutable-only records seem to > offer a pretty lame return-on-complexity. Nothing about values requires > you to use encapsulation, so you could model most immutable records with > a value type, with less boilerplate than a class (but more than none), > and the remainder with classes. (Immutable records buy you one thing > that values do not -- pointer polymorphism. That lets you make graphs > or trees of them.) But I think it is clear that this model of records > is a kind of weird half-one, half-the-other thing, and its not entirely > clear it would carry its weight. > > And, when users ask "why can't record components be mutable, after all, > records are about data, and some data is mutable", I don't think we have > a very good answer other than "immutability is good for you." I much > prefer the argument of "there are two orthogonal sets of tradeoffs; pick > one, the other, or both." > > ## Use cases for mutable records > > Here are two use cases that immediately come to mind; please share others. > > Groups of related mutable state. An example here is a set of counters. > If I define: > > record CacheCounters(public int hitCount, public int accessCount) { > float hitRate() { ... } > } > > then I can treat them as a single entity; store a counter-pair in a > variable, have arrays of them, use them as values of Maps, pass them > around, etc. (The fact that they're mutable introduces constraints, but > not new constraints; we deal with this problem every day.) I can even > lock on it, if that's how I want to do it. > > Domain objects. Another common use is domain agregates: > > record Person(String first, String last); > > If I want to marshal one of these to or from XML using a framework like > JAXB, I can provide mapping metadata between XML schema and classes, and > the framework will gladly populate the object for me. The way these > frameworks want to work is to instantiate a mutable object with a no-arg > constructor, and then set the fields (or call setters) as components > become available on the stream. Yes, you can write a binding framework > that waits until it has all the stuff and then calls a N-arg > constructor, but that's a lot harder, and uses a lot more memory. > Mutable records will play nicely with these frameworks. > > ## Embracing mutability > > I cheated a bit in the two examples I gave; neither had a no-arg > constructor. We could do a few things about this: > - Make the user write a no-arg constructor (and hopefully make this > easy enough) > - Provide a no-arg constructor for all records that just pass the > default values for that type to the default constructor (which might > reject them, if it doesn't like nulls) > - Try to provide a "minimal" constructor that only takes the final > fields. (I don't like this because changing a field between final and > not changes the signature of an implicit constructor, which won't be > binary compatible.) > > Similarly, you could object that deriving equals/hashCode from mutable > state is dangerous. (But List does do this.) 
Again, there are a few > ways to deal. We could adjust the standard equals/hashCode to only take > into account final fields. But, I'm skeptical of this, because I could > easily imagine people constructing records via mutation but then using > them in an effectively immutable way thereafter, and they might want the > stronger equals contract. Or, we could tell people, as we do with List, > not to use them as keys in hash-based collections. (We could even have > compiler warnings about this.) > > ## Additional considerations > > Here are a few less fundamental points about accepting mutable records, > none of which are slam-dunks, but might still be useful to consider: > - People will just work around it anyway, as they do with lambdas. If > a class has N-1 final fields, and one mutable one, what do we think > they're going to do? > - C# embraced mutable records. This isn't surprising, but what is > surprising is that Scala's case classes did also. While I don't have > data from either Neal or Martin, I suspect that they went through a > similar analysis -- that it would leave out too many desirable use cases > for the feature, and still not protect us from deeper mutability anyway. > - Mutability introduces pain, but so does repetition and boilerplate > -- it gives bugs a place to hide. Making the feature less applicable > consigns more users to using a more error-prone mechanism. > > ## Fields: final by default? > > One of the nudges we've considered is making fields final by default, > but letting them be declared non-final. This is a nudge, in that it > sends a message that records are best served immutable, but if you want > your revenge warm, you can have it. I think there are reasonable > arguments on both sides of this story, but one argument I am not > particularly motivated by is "but then we'd have to introduce non-final > as a keyword." If we think final-by-default is a good idea, I don't > think the lack of a denotation should be the impediment. > > ## Clone > > Clone is a mess, and I'm not sure there's a good answer here, but > there's surely a good discussion. > > As a user, I find the ability to clone arrays (despite being shallow) is > super useful, and it makes it far easier to be good about doing > defensive copies all the time. If cloning were harder (new > array/arraycopy), I'd probably cut more corners. If we can deliver the > same benefit for records, that seems enticing. > > There's a fair argument over whether the standard clone should be > shallow (easy to specify and implement) or should try to deeply clone > Cloneable components. Or maybe both options suck. Or maybe it should > be opt in; if the record extends Clonable, you get a clone() method. > > > What did I miss? > > > -------------- next part -------------- An HTML attachment was scrubbed... 
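To make the counters example concrete: under the "records are just macros for finer-grained features" framing, a mutable CacheCounters record would stand in for roughly the hand-written class below. The fieldName()-style accessors, the decision to fold the mutable fields into equals/hashCode, and the recordAccess helper are illustrative assumptions, not settled design.

    final class CacheCounters {
        private int hitCount;
        private int accessCount;

        CacheCounters() { }                            // the no-arg shape binding frameworks want
        CacheCounters(int hitCount, int accessCount) {
            this.hitCount = hitCount;
            this.accessCount = accessCount;
        }

        int hitCount()    { return hitCount; }
        int accessCount() { return accessCount; }

        void recordAccess(boolean hit) {               // mutation is the point of this variant
            accessCount++;
            if (hit) hitCount++;
        }

        float hitRate() {
            return accessCount == 0 ? 0f : (float) hitCount / accessCount;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof CacheCounters)) return false;
            CacheCounters that = (CacheCounters) o;
            return hitCount == that.hitCount && accessCount == that.accessCount;
        }

        @Override public int hashCode() {
            return 31 * hitCount + accessCount;
        }

        @Override public String toString() {
            return "CacheCounters[hitCount=" + hitCount + ", accessCount=" + accessCount + "]";
        }
    }

Everything below the two field declarations is the boilerplate the record form would absorb; the caveat about using such objects as hash keys applies to this class exactly as it would to the record.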
URL: From kevinb at google.com Fri Mar 23 20:18:45 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 23 Mar 2018 13:18:45 -0700 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> Message-ID: Just want to echo that I think prohibiting labeled break from e-switch is great for *both* reasons - keeping expressions functional in nature, *and* addressing the ambiguity in the argument after `break`. Strong +1. On Fri, Mar 23, 2018 at 12:58 PM, Brian Goetz wrote: > I think the burden should go the other way -- we should justify why such > flexibility is warranted, and I think this is difficult to do. > > The conditions under which such control flow are useful are already small, > and further, when you might want to code like that, the expression switch > form offers relatively little benefit over expression statements anyway > (since by definition you're using a lot of control-flow statements to make > your complex control-flow decision.) In exchange, the cost on all readers > is high, because they can't be sure about what "break X" means. Why burden > all users with this complexity? Why spend any of our complexity budget on > such a low-leverage option? > > On 3/23/2018 3:53 PM, John Rose wrote: > > On Mar 23, 2018, at 12:41 PM, Brian Goetz wrote: > > > I think so. Its not detecting the ambiguity, its that the possibility of > mixing control flow kinds means the poor reader has to reason about all the > possibilities, and not be sure what "break FOO" means in a switch statement. > > > A lighter fix, then, would be to require parentheses always. > An even lighter fix would be leave that kind of clarification to style > guides. > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Fri Mar 23 20:20:04 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 23 Mar 2018 13:20:04 -0700 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> Message-ID: <52A7D597-4E8F-4A14-BF40-A550DA83DF64@oracle.com> On Mar 23, 2018, at 12:58 PM, Brian Goetz wrote: > > In exchange, the cost on all readers is high, because they can't be sure about what "break X" means. Why burden all users with this complexity? Why > spend any of our complexity budget on such a low-leverage option? It seems to me that your argument cuts in a different direction: If the important risk is that users can't be sure what "break X" means without a non-local scan, we should absolutely require the parentheses. After all, an e-switch can be a large construct containing its own control flow, and the user can run into a "break X" in the middle of a page of code without knowing what kind of break it is. 
So, we could fix the main problem by requiring "break X" to always mean the label X and "break (x)" to always mean the value x. i just realized that the X-heavy first column also means that I can't make use of statement labels *anywhere inside* of e-switch, even if I just copy-and-pasted self-contained code from elsewhere. On Mar 23, 2018, at 12:41 PM, Brian Goetz wrote: > > I would prefer to rule it out. We don't allow returns out of the middle of a conditional, after all. I prefer the simplicity that "all expressions either complete normally or complete abruptly with cause exception." If you really need this kind of control flow, where maybe you yield a value or maybe you return, use a switch statement: > > int result; > switch (e) { > case 0: > for (int y : someInts) { > if (y < x) { result = y; break ; } > } > return 0; > default: return e; > } > // consume result here In the above refactoring I *must* use label-free break to escape from the "for" statement. Which means I don't have access to the normal range of Java control flow constructs inside of an e-switch. This is unusual: In most ways, wherever Java allows a single statement, it also allows a block with any variety of nested statements. (We pushed this pretty far in Java 1.1 with inner classes, for example.) The effect of your proposal here is that e-switches can only contain a simplified sub-language of the Java statement language. I'll bet you view that as a positive, since it will tend to push people away from writing really complex stuff inside of e-switches, but I see it as a sharp edge. The "will it refactor" heuristic exposes it nicely; again: I can't freely refactor between method body and e-switch if my method body contains labeled statements. And this limit goes in because the poor user won't deal well with "break E"? I think mandating "break (e)" is a lighter weight solution, even though it has some of the #{ Stand Out }# smell. ? John From john.r.rose at oracle.com Fri Mar 23 20:33:09 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 23 Mar 2018 13:33:09 -0700 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <52A7D597-4E8F-4A14-BF40-A550DA83DF64@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> <52A7D597-4E8F-4A14-BF40-A550DA83DF64@oracle.com> Message-ID: On Mar 23, 2018, at 1:20 PM, John Rose wrote: > > I think mandating "break (e)" is a lighter > weight solution, even though it has some of the > #{ Stand Out }# smell. > > ? John P.S. And to be clear, I'm simply pointing out surprising consequences of your design, to elucidate its trade-offs. I won't be too disappointed if we go with the odd new contextual restriction on "break-l", since "break-l" is a marginal feature only seldom used. It can be relegated to a helper method as a last resort. But, again, mandating a helper method to get out of a tight spot can be a design smell. Whatever we decide, let's document the consequences on coders as best as we can. -------------- next part -------------- An HTML attachment was scrubbed... 
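The helper-method escape hatch mentioned above might look like the following: the loop moves into an ordinary method and the arm of the (provisional) expression switch becomes a plain call. Note that the extracted form has to commit to yielding a value; it cannot reproduce the original's choice between yielding y and returning from the enclosing method, which is exactly the restriction being debated. The helper name and the arrow-case syntax are illustrative only.

    // Hypothetical helper: first element below the bound, or a fallback value.
    static int firstBelow(int[] someInts, int bound, int otherwise) {
        for (int y : someInts) {
            if (y < bound) return y;
        }
        return otherwise;
    }

    // Provisional expression-switch form (not valid Java at the time of this thread):
    //
    //     int i = switch (e) {
    //         case 0  -> firstBelow(someInts, x, 0);
    //         default -> e;
    //     };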
URL: From brian.goetz at oracle.com Fri Mar 23 20:33:24 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 23 Mar 2018 16:33:24 -0400 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <52A7D597-4E8F-4A14-BF40-A550DA83DF64@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> <52A7D597-4E8F-4A14-BF40-A550DA83DF64@oracle.com> Message-ID: > It seems to me that your argument cuts in a different direction: If the > important risk is that users can't be sure what "break X" means without > a non-local scan, we should absolutely require the parentheses. I think that's a reasonable conversation -- but it wouldn't make me any more interested in softening the break-e rules.? That feels like all downside and no upside to me.?? And, requiring the parens may well just be seen as fussiness on the part of the compiler, so I'm not sure of the point. > So, we could fix the main problem by requiring "break X" to always > mean the label X and "break (x)" to always mean the value x. I don't think "expressions always complete normally or throw" is a "problem" to be "fixed" -- I think its a feature! > i just realized that the X-heavy first column also means that I can't > make use of statement labels *anywhere inside* of e-switch, even > if I just copy-and-pasted self-contained code from elsewhere. No, you can do this: ??? e-switch { ??????? LABEL: ??????? s-switch { ??????????? s-switch { break LABEL; } ??????? } ??? } You just can't "throw" *across* an e-switch boundary.? The X means if a nested construct doesn't handle it, there's a problem. From guy.steele at oracle.com Fri Mar 23 20:51:22 2018 From: guy.steele at oracle.com (Guy Steele) Date: Fri, 23 Mar 2018 16:51:22 -0400 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> Message-ID: String s = switch (e) { case 0 -> break ?foofoo?; case 1: if (p == 0) break x; else { String z = hairy(x); break z+z; } case 2 -> ?barbar?; }; Now I decide that case 1 has three subcases. So I change the `if` to a statement `switch`. String s = switch (e) { case 0 -> break ?foofoo?; case 1: switch (p) { case 0: break x; case 1: break x+x; default: String z = hairy(x); break z+z; } case 2 -> ?barbar?; }; FAIL. One can argue that I should have done something else, such as use a three-way if-then-else: String s = switch (e) { case 0 -> break ?foofoo?; case 1: if (p == 0) break x; else if (p == 1) break x+x; else { String z = hairy(x); break z+z; } case 2 -> ?barbar?; }; or use a switch expression rather than a statement switch: String s = switch (e) { case 0 -> break ?foofoo?; case 1 -> switch (p) { case 0 -> x; case 1 -> x+x; default: String z = hairy(x); break z+z; } case 2 -> ?barbar?; }; All I?m doing is demonstrating that a common refactoring pattern ?turn the `if` statement into a `switch` statement? has more pitfalls than it used to once we introduce `switch` expressions. 
?Guy From john.r.rose at oracle.com Fri Mar 23 21:36:24 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 23 Mar 2018 14:36:24 -0700 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> <52A7D597-4E8F-4A14-BF40-A550DA83DF64@oracle.com> Message-ID: On Mar 23, 2018, at 1:33 PM, Brian Goetz wrote: > >> i just realized that the X-heavy first column also means that I can't >> make use of statement labels *anywhere inside* of e-switch, even >> if I just copy-and-pasted self-contained code from elsewhere. > > No, you can do this: > > e-switch { > LABEL: > s-switch { > s-switch { break LABEL; } > } > } (Looking again at the table.) Yes, I see; the break-l has a P. OK, I was reading the table wrong. > You just can't "throw" *across* an e-switch boundary. The X means > if a nested construct doesn't handle it, there's a problem. On that we all agree: e-switch, like any other e, can complete normally with a value, or else can complete with a throw. It *cannot* complete with a branch to some location other than a catch statement (which location would be defined by a label, loop, or switch outside the switch, if it were anywhere, but it isn't). OK, so half of my discomfort was a temporary misunderstanding. You can paste locally consistent control flow in and out of switch-e's, even if the control flow uses statement labels. Good. The rest of it is about where to put the sharp edges: Can I break-e from a switch-e wherever I might consider doing return from a method/lambda? Or does break-e have extra restrictions to prevent certain ambiguities? Your answer is the latter. Speaking of ambiguities, should this be illegal, even though under your rules it happens to be unambiguous? Or is it just a puzzler we tolerate? e-switch { LABEL: s-switch { { int LABEL = 2; break LABEL; } } } Also the other way: LABEL: s-switch { e-switch { int LABEL = 2; break LABEL; } } You want "break LABEL" to be immediately recognized as either a break-l or a break-e. The above cases seem to make it hard to do so. We could declare such code pathological and demand a relabel in either case, just as we declare local variable shadowing pathological in most cases, demanding a renaming of one of the variables. Local variable shadowing is more likely to occur than label shadowing, given that labels are a rare construct, so maybe we just let the above be a puzzler, rather than add a rule. ? John -------------- next part -------------- An HTML attachment was scrubbed... 
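For context on the shadowing question: in current Java the situation is unambiguous, because statement labels and local variables live in separate namespaces and `break` can only name a label (JLS 14.7 explicitly permits reusing an identifier for both). The puzzler only appears once `break` can also take an expression. A legal-today sketch:

    static void labelAndVariable(int p) {
        LABEL:
        for (int i = 0; i < 10; i++) {
            int LABEL = 2;          // legal: labels and variables are in different namespaces
            if (p == LABEL) {       // here LABEL is the variable
                break LABEL;        // and here it can only mean the labeled loop
            }
        }
    }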
URL: From brian.goetz at oracle.com Fri Mar 23 21:42:14 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 23 Mar 2018 17:42:14 -0400 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> <52A7D597-4E8F-4A14-BF40-A550DA83DF64@oracle.com> Message-ID: <8cc71e72-438d-60fd-8279-2d29be7bb1e0@oracle.com> > > The rest of it is about where to put the sharp edges: ?Can I > break-e from a switch-e wherever I might consider doing return > from a method/lambda? ?Or does break-e have extra restrictions > to prevent certain ambiguities? ?Your answer is the latter. Right.? To avoid ambiguity with other breaky contexts, we let the innermost breaky context determine the allowable break modes. > Speaking of ambiguities, should this be illegal, even > though under your rules it happens to be unambiguous? > Or is it just a puzzler we tolerate? > > ? ? e-switch { > ??????? LABEL: > ??????? s-switch { > ? ? ? ? ? ?{ int LABEL = 2; break LABEL; } > ??????? } > ??? } It could be allowed (since you can't break-e from the switch), but it seems safer to call it a compile error.? After all, you can always alpha-rename the label. > Also the other way: > > ? ? LABEL: > ?? ?s-switch { > ? ? ? ? e-switch { int LABEL = 2; break LABEL; } > ? ? } > > You want "break LABEL" to be immediately recognized as > either a break-l or a break-e. ?The above cases seem to > make it hard to do so. ?We could declare such code > pathological and demand a relabel in either case, > just as we declare local variable shadowing pathological > in most cases, demanding a renaming of one of the > variables. ?Local variable shadowing is more likely > to occur than label shadowing, given that labels > are a rare construct, so maybe we just let the > above be a puzzler, rather than add a rule. Or, make it the same rule.? If in "break x", x could resolve as either a label or an expression, call it an ambiguity.? (In either case, depending on whether you rename the variable or the label, you could then get a different error, which is "we don't serve that kind of break round these parts.")? But I think its the same game either way. -------------- next part -------------- An HTML attachment was scrubbed... 
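"Innermost breaky context" echoes how an unlabeled break already behaves today: inside a switch that is itself inside a loop, a plain break binds to the switch, and escaping the loop takes a label. A small current-Java example:

    static int firstPositiveIndex(int[] xs) {
        int found = -1;
        SCAN:
        for (int i = 0; i < xs.length; i++) {
            switch (Integer.signum(xs[i])) {
                case 1:
                    found = i;
                    break SCAN;     // a plain break here would only exit the switch,
                                    // the innermost "breaky" construct
                default:
                    break;          // exits the switch, not the loop
            }
        }
        return found;
    }

The proposal extends the same locality principle from "which construct does break exit" to "which form of break is allowed at all".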
URL: From brian.goetz at oracle.com Fri Mar 23 21:49:46 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 23 Mar 2018 17:49:46 -0400 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <8cc71e72-438d-60fd-8279-2d29be7bb1e0@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> <9533cc42-6c3c-e012-2df2-59105bd0bbb9@oracle.com> <52A7D597-4E8F-4A14-BF40-A550DA83DF64@oracle.com> <8cc71e72-438d-60fd-8279-2d29be7bb1e0@oracle.com> Message-ID: <61eb96c1-5d67-4af4-226a-f52e9317f5aa@oracle.com> Stepping back, I think there's two ways to look at this: ?- break expression and break label are totally different statements, that happen to be spelled similarly ?- break is the same statement all around, but just as return requires a value in a value-returning method and requires no value in a void method, the meaning of break must agree with the innermost breaky context. I think the latter is far easier for users to reason about, while giving up relatively little flexibility.? So doing a "break label" in an e-switch is the same error as doing "return 3" in a void method. On 3/23/2018 5:42 PM, Brian Goetz wrote: > >> >> The rest of it is about where to put the sharp edges: ?Can I >> break-e from a switch-e wherever I might consider doing return >> from a method/lambda? ?Or does break-e have extra restrictions >> to prevent certain ambiguities? ?Your answer is the latter. > > Right.? To avoid ambiguity with other breaky contexts, we let the > innermost breaky context determine the allowable break modes. > >> Speaking of ambiguities, should this be illegal, even >> though under your rules it happens to be unambiguous? >> Or is it just a puzzler we tolerate? >> >> ? ? e-switch { >> ??????? LABEL: >> ??????? s-switch { >> ? ? ? ? ? ?{ int LABEL = 2; break LABEL; } >> ??????? } >> ??? } > > It could be allowed (since you can't break-e from the switch), but it > seems safer to call it a compile error.? After all, you can always > alpha-rename the label. > >> Also the other way: >> >> ? ? LABEL: >> ?? ?s-switch { >> ? ? ? ? e-switch { int LABEL = 2; break LABEL; } >> ? ? } >> >> You want "break LABEL" to be immediately recognized as >> either a break-l or a break-e. ?The above cases seem to >> make it hard to do so. ?We could declare such code >> pathological and demand a relabel in either case, >> just as we declare local variable shadowing pathological >> in most cases, demanding a renaming of one of the >> variables. ?Local variable shadowing is more likely >> to occur than label shadowing, given that labels >> are a rare construct, so maybe we just let the >> above be a puzzler, rather than add a rule. > > Or, make it the same rule.? If in "break x", x could resolve as either > a label or an expression, call it an ambiguity.? (In either case, > depending on whether you rename the variable or the label, you could > then get a different error, which is "we don't serve that kind of > break round these parts.")? But I think its the same game either way. -------------- next part -------------- An HTML attachment was scrubbed... 
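Spelling out the return analogy: the enclosing method already determines which form of return is legal, and under the proposal the innermost breaky construct would determine which form of break is legal in the same way.

    int valueReturning() {
        return 3;          // ok: a value is required here
        // return;         // error: missing return value
    }

    void voidReturning() {
        return;            // ok: no value allowed here
        // return 3;       // error: cannot return a value from a void method
    }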
URL: From dmitry.petrashko at gmail.com Mon Mar 26 18:31:58 2018 From: dmitry.petrashko at gmail.com (Dmitry Petrashko) Date: Mon, 26 Mar 2018 11:31:58 -0700 Subject: Mutable records In-Reply-To: <96bbb47c-eeff-2cb8-4c59-ee0792c39c0a@oracle.com> References: <4c05fc63-2ba5-778e-be91-dee735d03861@oracle.com> <96bbb47c-eeff-2cb8-4c59-ee0792c39c0a@oracle.com> Message-ID: Hi Brian, > What did I miss? There is one more usecase for mutable record types in C++ structs in which I find some value. Consider a method that requires a dozen of arguments. In case this method serves a single purpose, those argument are very likely to be somehow related. Instead of passing 10 arguments separately they can be grouped into structures that both make it easier to document the code and to write it. Tuples don't serve this purpose well, a they don't give names or explanations, while classes are too verbose. In order to keep those structured arguments on par with normal arguments in Java, they have to be mutable, as normal arguments are re-assignable in Java, unlike in Scala. > - People will just work around it anyway, as they do with lambdas. If a class has N-1 final fields, and one mutable one, what do we think they're going to do? > - Mutability introduces pain, but so does repetition and boilerplate -- it gives bugs a place to hide. Making the feature less applicable consigns more users to using a more error-prone mechanism. In case records are just syntactic sugar that does not have magic that cannot be reproduced by expanding the definition side, it might be worth considering making records immutable only in case you want to steer culture towards immutability. As long as they can expand the definition site manually without changing public API, they can do it. This would also sidestep adding a new keyword to the language. -Dmitry On Fri, Mar 23, 2018 at 11:03 AM, Brian Goetz wrote: > A few people have asked, "wouldn't it just be easier to prohibit > mutability in records"? And while it surely would be easier (most of the > issues I raised in my writeup go away without mutability), I think it would > also greatly restrict the utility of the feature. Let me talk about why, > and give some examples -- and then I'd like to talk about what we can do, > if anything, to make the mutable use cases easier. > > ## General argument: Mutability is pervasive in Java; you can only push it > away a bit. > > We saw this with lambdas; developers are all too eager to "work around" > the limitation on mutable local capture by wrapping their mutables in a > one-element array. In fact, IDEs even "helpfully" offers to do this for > you, thus ensuring that everyone thinks this is OK. > > We will see this again with value types; even though value types are > immutable, value types can contain references to mutable objects, and > trying to enforce "values all the way down" would result in fairly useless > value types. > > (That doesn't mean we can't nudge towards immutability where we think it > makes sense, if we think the value of the nudge exceeds the irregularity or > complexity it entails.) > > ## Records and value types: goals and similarities > > While records and value types have some features in common (getting > equals() for free), they have different motivations. 
> > Value types are about treating aggregates as, well, values, with all the > things that entails; they can be freely shared, the runtime can routinely > optimize them by putting them on the stack or in registers and flatten them > into enclosing values, classes, or arrays (yielding better density and > flatness.) What they ask you to give up in exchange is identity, which > means giving up mutability and layout polymorphism. > > Records are about treating data as data; when modeling aggregates with > records, the result is transparent classes whose API and representation are > the same thing. This means that records can be freely interconverted > between their exploded and aggregate forms with no loss of information. > What they ask you to give up is the freedom to define the mapping between > representation and API (constructors, accessors, equals, hashCode, > deconstruction) in a nontransparent way. (Essentially, you give up all > encapsulation except for the ability to control writes to their state.) > > My claim is that the goals are mostly orthogonal, and the benefits and > tradeoffs of each are as well. All four quadrants make sense to me. Some > aggregates are values but not transparent (think cursors that hold > references into the internals of a data structure, or hold a native > resource); some are "just their data" but not values (graph nodes, as well > as the mutable examples below), and others are both (value records). > > The superficial commonalities between records and values (both are > restricted forms of aggregate, and these restrictions make it possible to > provide sensible defaults for things like equals) tease us into thinking > they are the same thing, but I don't think they are. > > Assuming this to be true, how can we justify having two new constructs? > Value types, by nature of what they require the developer to give up, > enable the runtime to make significant optimizations it could not otherwise > make. So if we want flat and dense data, this is basically our only option > -- make the programmer consent to the handcuffs. The argument for records > is more of a contingent one; records allow you to express more with less. > The "more with less" has at least two aspects; in addition to the obvious > reduction in boilerplate, libraries and frameworks can make more reasonable > assumptions about what construction or deconstruction means, and therefore > can build useful functionality safely (such as marshaling to/from XML.) > But records don't let you do anything you can't already do with classes. > So if I had a quota, I'd have to pick values over records. > > In a language with values on the roadmap, immutable-only records seem to > offer a pretty lame return-on-complexity. Nothing about values requires > you to use encapsulation, so you could model most immutable records with a > value type, with less boilerplate than a class (but more than none), and > the remainder with classes. (Immutable records buy you one thing that > values do not -- pointer polymorphism. That lets you make graphs or trees > of them.) But I think it is clear that this model of records is a kind of > weird half-one, half-the-other thing, and its not entirely clear it would > carry its weight. > > And, when users ask "why can't record components be mutable, after all, > records are about data, and some data is mutable", I don't think we have a > very good answer other than "immutability is good for you." 
I much prefer > the argument of "there are two orthogonal sets of tradeoffs; pick one, the > other, or both." > > ## Use cases for mutable records > > Here are two use cases that immediately come to mind; please share others. > > Groups of related mutable state. An example here is a set of counters. > If I define: > > record CacheCounters(public int hitCount, public int accessCount) { > float hitRate() { ... } > } > > then I can treat them as a single entity; store a counter-pair in a > variable, have arrays of them, use them as values of Maps, pass them > around, etc. (The fact that they're mutable introduces constraints, but > not new constraints; we deal with this problem every day.) I can even lock > on it, if that's how I want to do it. > > Domain objects. Another common use is domain agregates: > > record Person(String first, String last); > > If I want to marshal one of these to or from XML using a framework like > JAXB, I can provide mapping metadata between XML schema and classes, and > the framework will gladly populate the object for me. The way these > frameworks want to work is to instantiate a mutable object with a no-arg > constructor, and then set the fields (or call setters) as components become > available on the stream. Yes, you can write a binding framework that waits > until it has all the stuff and then calls a N-arg constructor, but that's a > lot harder, and uses a lot more memory. Mutable records will play nicely > with these frameworks. > > ## Embracing mutability > > I cheated a bit in the two examples I gave; neither had a no-arg > constructor. We could do a few things about this: > - Make the user write a no-arg constructor (and hopefully make this easy > enough) > - Provide a no-arg constructor for all records that just pass the default > values for that type to the default constructor (which might reject them, > if it doesn't like nulls) > - Try to provide a "minimal" constructor that only takes the final > fields. (I don't like this because changing a field between final and not > changes the signature of an implicit constructor, which won't be binary > compatible.) > > Similarly, you could object that deriving equals/hashCode from mutable > state is dangerous. (But List does do this.) Again, there are a few ways > to deal. We could adjust the standard equals/hashCode to only take into > account final fields. But, I'm skeptical of this, because I could easily > imagine people constructing records via mutation but then using them in an > effectively immutable way thereafter, and they might want the stronger > equals contract. Or, we could tell people, as we do with List, not to use > them as keys in hash-based collections. (We could even have compiler > warnings about this.) > > ## Additional considerations > > Here are a few less fundamental points about accepting mutable records, > none of which are slam-dunks, but might still be useful to consider: > - People will just work around it anyway, as they do with lambdas. If a > class has N-1 final fields, and one mutable one, what do we think they're > going to do? > - C# embraced mutable records. This isn't surprising, but what is > surprising is that Scala's case classes did also. While I don't have data > from either Neal or Martin, I suspect that they went through a similar > analysis -- that it would leave out too many desirable use cases for the > feature, and still not protect us from deeper mutability anyway. 
> - Mutability introduces pain, but so does repetition and boilerplate -- > it gives bugs a place to hide. Making the feature less applicable consigns > more users to using a more error-prone mechanism. > > ## Fields: final by default? > > One of the nudges we've considered is making fields final by default, but > letting them be declared non-final. This is a nudge, in that it sends a > message that records are best served immutable, but if you want your > revenge warm, you can have it. I think there are reasonable arguments on > both sides of this story, but one argument I am not particularly motivated > by is "but then we'd have to introduce non-final as a keyword." If we > think final-by-default is a good idea, I don't think the lack of a > denotation should be the impediment. > > ## Clone > > Clone is a mess, and I'm not sure there's a good answer here, but there's > surely a good discussion. > > As a user, I find the ability to clone arrays (despite being shallow) is > super useful, and it makes it far easier to be good about doing defensive > copies all the time. If cloning were harder (new array/arraycopy), I'd > probably cut more corners. If we can deliver the same benefit for records, > that seems enticing. > > There's a fair argument over whether the standard clone should be shallow > (easy to specify and implement) or should try to deeply clone Cloneable > components. Or maybe both options suck. Or maybe it should be opt in; if > the record extends Clonable, you get a clone() method. > > > What did I miss? > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmitry.petrashko at gmail.com Mon Mar 26 19:16:41 2018 From: dmitry.petrashko at gmail.com (Dmitry Petrashko) Date: Mon, 26 Mar 2018 12:16:41 -0700 Subject: Records -- current status In-Reply-To: References: Message-ID: Hi Brian, thank you for your work on this and this wonderful writeup. After working both as Scala developer and Scala compiler developer those design areas are very pleasant to reiterate and try to think when\if Scala approach will work well in Java. I agree with all that you wrote and want to contribute several observations: > Records could be safely made cloneable() with automatic support too, but not clear if this is worth it My experience suggests that people rarely want to close case classes in Scala. This is likely because, we provide a bunch of methods with default arguments called `copy`. You can write something like case class A(i: Int, d: Double) val myA = A(1, 2).copy(d = 2) Fields that you don't specify are shallow copied from original one. I'm unsure if you want or can provide similar api but it was very useful in Scala. > Extension Do I understand it right that you are proposing that Records cannot inherit normal classes, while abstract records can be inherited by normal classes? > Some have questioned whether this carries its weight, especially given how Scala doesn't support case-to-case extension. To give some perspective, over years I started to love this limitation. It steers towards a better design where Data does not extend other Data. One area that I want to draw your attention to is code evolution. Due to subtle things in Scala case classes handling, there are some situations when it wasn't possible to replace a case class with seemingly equivalent user written code. While this wasn't an original intention, such implementation artifacts aren't as rare as we wished as Scala compiler grew a lot of behavior specific to case classes. 
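On the array-clone point in the message quoted above, the defensive-copy idiom it enables looks like this in current Java; a record-level clone(), if provided, would presumably play the same role for whole records. The class and field names are invented for illustration.

    final class Snapshot {
        private final int[] samples;

        Snapshot(int[] samples) {
            this.samples = samples.clone();    // defensive copy on the way in
        }

        int[] samples() {
            return samples.clone();            // and on the way out
        }
    }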
This behavior work unreliably or does not work at all in case you inherit a case class. Is a decision to make class be a record a binding decision over long term? Given that records cannot inherit from normal classes, but normal classes can inherit from records AND names are important part of source level compatibility, I'm torn on abstract records. They are the only proper inheritance available for records in your design, which I feel is a compatibility trap. In case one day someone will need to add something that is not supported by records to what used to be an abstract record, they will have to convert the entire hierarchy including all the children. Sometimes this is infeasible. Thus if you have abstract records that can be inherited, I feel that there might be a need for records to be able to inherit arbitrary classes, and this makes it much harder to provide safe records. Thus I'd like to also suggest to not include abstract records in first implementations. Best, Dmitry On Fri, Mar 16, 2018 at 11:55 AM, Brian Goetz wrote: > There are a number of potentially open details on the design for records. > My inclination is to start with the simplest thing that preserves the > flexibility and expectations we want, and consider opening up later as > necessary. > > One of the biggest issues, which Kevin raised as a must-address issue, is > having sufficient support for precondition validation. Without foreclosing > on the ability to do more later with declarative guards, I think the recent > construction proposal meets the requirement for lightweight enforcement > with minimal or no duplication. I'm hopeful that this bit is "there". > > Our goal all along has been to define records as being ?just macros? for a > finer-grained set of features. Some of these are motivated by boilerplate; > some are motivated by semantics (coupling semantics of API elements to > state.) In general, records will get there first, and then ordinary > classes will get the more general feature, but the default answer for "can > you relax records, so I can use it in this case that almost but doesn't > quite fit" should be "no, but there will probably be a feature coming that > makes that class simpler, wait for that." > > > Some other open issues (please see my writeup at > http://cr.openjdk.java.net/~briangoetz/amber/datum.html for reference), > and my current thoughts on these, are outlined below. Comments welcome! > > - Extension. The proposal outlines a notion of abstract record, which > provides a "width subtyped" hierarchy. Some have questioned whether this > carries its weight, especially given how Scala doesn't support case-to-case > extension (some see this as a bug, others as an existence proof.) Records > can implement interfaces. > > - Concrete records are final. Relaxing this adds complexity to the > equality story; I'm not seeing good reasons to do so. > > - Additional constructors. I don't see any reason why additional > constructors are problematic, especially if they are constrained to > delegate to the default constructor (which in turn is made far simpler if > there can be statements ahead of the this() call.) Users may find the lack > of additional constructors to be an arbitrary limitation (and they'd > probably be right.) > > - Static fields. Static fields seem harmless. > > - Additional instance fields. These are a much bigger concern. 
While the > primary arguments against them are of the "slippery slope" variety, I still > have deep misgivings about supporting unrestricted non-principal instance > fields, and I also haven't found a reasonable set of restrictions that > makes this less risky. I'd like to keep looking for a better story here, > before just caving on this, as I worry doing so will end up biting us in > the back. > > - Mutability and accessibility. I'd like to propose an odd choice here, > which is: fields are final and package (protected for abstract records) by > default, but finality can be explicitly opted out of (non-final) and > accessibility can be explicitly widened (public). > > - Accessors. Perhaps the most controversial aspect is that records are > inherently transparent to read; if something wants to truly encapsulate > state, it's not a record. Records will eventually have pattern > deconstructors, which will expose their state, so we should go out of the > gate with the equivalent. The obvious choice is to expose read accessors > automatically. (These will not be named getXxx; we are not burning the > ill-advised Javabean naming conventions into the language, no matter how > much people think it already is.) The obvious naming choice for these > accessors is fieldName(). No provision for write accessors; that's > bring-your-own. > > - Core methods. Records will get equals, hashCode, and toString. > There's a good argument for making equals/hashCode final (so they can't be > explicitly redeclared); this gives us stronger preservation of the data > invariants that allow us to safely and mechanically snapshot / serialize / > marshal (we'd definitely want this if we ever allowed additional instance > fields.) No reason to suppress override of toString, though. Records could > be safely made cloneable() with automatic support too (like arrays), but > not clear if this is worth it (its darn useful for arrays, though.) I > think the auto-generated getters should be final too; this leaves arrays as > second-class components, but I am not sure that bothers me. > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Mar 27 13:04:09 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 27 Mar 2018 09:04:09 -0400 Subject: Records -- current status In-Reply-To: References: Message-ID: > > > Extension > > Do I understand it right that you are proposing that Records cannot > inherit normal classes, while abstract records can be inherited by > normal classes? More restrictive than that: ?- records can extend abstract records, or Object (really, AbstractRecord) ?- that's it. There are two primary drivers why non-abstract records shouldn't be extended by something else (record or not): ?- Identity anomalies.? If S extends R, where R is a record, then ctor composed-with dtor is not an identity on R.? That means if someone does: ??? case R(R_ARGS) -> new R(R_ARGS) they may think they are cloning, but in fact they are decapitating. (Even if S has no additional state over R, it still has typestate that is discarded.) ?- Equality anomalies.? If S can extend R, and S has state that wants to participate in equals/hashCode, and therefore wants to override equals, this may violate the symmetry or transitivity of equals. The current notion of abstract records avoids both of these because it has no equality (abstract records reabstract equals), and has no constructors, so ctor \compose dtor is meaningless on abstract records. 
> Is a decision to make class be a record a binding decision over long > term? We intend that there should be a source- and binary- compatible refactoring between a record and an equivalent class. The class to record direction is the one that is immediately interesting, since people have source bases full of classes they'd like to turn into records.? This is OK as long as the members the class already has (constructor, equals) conform to the semantics of their record equivalents, and the author is OK with the class being final and transparent. The other direction is likely to be less common, but also important; it is analogous to refactoring an enum to a class that uses the type-safe enum pattern, which happens once in a while when you hit the limits of what you can do with an enum/record.? But given that the goal all along is to have records be "just macros for corresponding classes", this should be doable. Where the decision is binding is when you take on a specific class signature: ??? record Point(int x, int y); If you want to add a `z` to this, you're venturing onto the ice. Existing code may assume the existence of an XY constructor or deconstructor; we can probably handle this via adding additional explicit ctor/dtors.? But existing code may also depend on the behavioral assumption that equality of points is based on x and y. So the record <--> class refactoring is practical, but the record <--> different record refactoring is dicier.? (That said, the sweet spot for records is when they are used within a maintenance boundary, where you can find all the uses.? So this might be OK too.) > Given that records cannot inherit from normal classes, but normal > classes can inherit from records AND > names are important part of source level compatibility, I'm torn on > abstract records. With the more restricted understanding, that normal classes cannot extend records, does this change your position? From gavin.bierman at oracle.com Tue Mar 27 14:17:36 2018 From: gavin.bierman at oracle.com (Gavin Bierman) Date: Tue, 27 Mar 2018 15:17:36 +0100 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> Message-ID: > On 23 Mar 2018, at 20:51, Guy Steele wrote: > > > String s = switch (e) { > case 0 -> break ?foofoo?; > case 1: > if (p == 0) break x; > else { String z = hairy(x); break z+z; } > case 2 -> ?barbar?; > }; > > Now I decide that case 1 has three subcases. So I change the `if` to a statement `switch`. > > String s = switch (e) { > case 0 -> break ?foofoo?; > case 1: > switch (p) { > case 0: break x; > case 1: break x+x; > default: String z = hairy(x); break z+z; > } > case 2 -> ?barbar?; > }; > > FAIL. The inner switch is actually an expression switch, so you just need an extra break: String s = switch (e) { case 0 -> ?foofoo?; case 1: break switch (p) { case 0: break x; case 1: break x+x; default: String z = hairy(x); break z+z; }; case 2 -> ?barbar?; }; > All I?m doing is demonstrating that a common refactoring pattern ?turn the `if` statement into a `switch` statement? has more pitfalls than it used to once we introduce `switch` expressions. Agreed. Gavin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.goetz at oracle.com Tue Mar 27 19:15:24 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 27 Mar 2018 15:15:24 -0400 Subject: Raw string literals -- where we are, how we got here Message-ID: Now that things have largely stabilized with raw string literals, let me summarize where we are, and how we got here. ## The proposal Where we are now is that a raw string literal consists of an opening delimiter which is a sequence of N consecutive backticks, for some N > 0, a body which may contain any characters (including newlines) except for a sequence of N consecutive backticks, and a closing delimiter of N consecutive backticks. Any line-end sequences (CR, LF, CRLF) are normalized to a single newline (LF), and the remainder of the body is treated without any further transformation (including without unicode escape processing), and placed in a String.? No other processing is done on the contents. A raw string literal has type String, just like a traditional string literal, and can be used anywhere an expression of type String can be used (assignment, concatenation, etc.) Examples: ??? String s = `Doesn't have a \n newline character in it`; ??? String ss = `a multi- ??????? line-string`; ??? String sss = ``a string with a single tick (`) character in it``; ??? String ssss = `a string with two ticks (``) in it`; ??? String sssss = `````a string literal with gratuitously many ticks in its delimiter`````; Note that the delimiter need not be _more_ ticks than the longest tick sequence in the body; if the body contains sequences of two ticks and three ticks, it can be delimited by one tick, four ticks, five ticks, etc.? This makes it possible to choose a minimal delimiter that doesn't interfere with the body. ## Design Center The design center for this feature is _raw string literals_.? Not multi-line strings (though this is well handled), not interpolated strings (though this can be considered in the future.)? It turns off all inline escaping, even unicode escaping (which is usually handled by the lexer before the production even sees the characters.) We stay as true as we can to this principle: raw means raw, not 99% raw with a little bit of escaping.? (The single exception is normalizing of carriage control, the absence of which would just be too surprising.) The primary use case addressed by raw string literals are snippets of code from other languages embedded in Java source files.? Here we interpret "languages" broadly; they could be traditional programming languages, specialized languages like regular expressions or SQL, or human languages.? We want that the Java lexing not interfere at all; given a suitable O(1) incantation (picking a non-conflicting delimiter), you can freely cut and paste the foreign string to and from Java.? Being able to do this is not only convenient, but it reduces errors due to hand-mangling the string, and enhances readability because the embedded snippet is free of interference from Java. Choosing raw-ness as a design center leads to a simpler design, which is good, but it also is _more stable_, because it leads us away from the temptation to tweak the rules here and there in ways that might be subjectively attractive, but that further increase the complexity of the feature.? This design choice belies a priority choice: the high-order bit is _no embedding anomalies_. 
Users don't have to reason about whether they need to hand-mangle a snippet to avoid it being mangled by the compiler or runtime; given a suitable choice of delimiter, there's nothing else to think about.? (IDEs can help with the "writing code" part of this.) The various additional features we might be tempted to put in (special processing for leading or trailing blank lines, leading white space, trimming to markers, etc) can instead be handled via library functionality.? Since raw string literals are Strings, we can further process them with library code -- both JDK code and user code (though methods on String have the advantage that they can be chained, rather than wrapped, which most users will prefer).? Adding new string manipulation features via libraries rather than through the language is easier, can be done by users, and is not constrained by the demands of consistency (you can have seven different trimming methods, each with their own definition of whitespace, if you like), whereas a language feature has to be one-size-fits-all.? Moving this complexity to the library where possible leads to a simpler feature and more choices for users. #### A road not taken We choose to divide the world of string literals first into raw and non-raw literals; from this, multi-line strings falls out for free as we can treat line breaks in the source file as just more raw characters. We could have chosen, instead, to first divide the world into single and multi-line strings, and then into raw and non-raw; this would have left us with four choices (raw single line, raw multi-line, cooked single-line, cooked multi-line.)? This also would have been a defensible position, but seemed to add lexical complexity for little gain. #### The exception that proves the rule The one exception to raw-ness is that we normalize the line terminators to the most common (*nix) choice of a single newline, rather than using the platform-specific line terminator on the system that happens to have compiled the classfile.? The alternative would have just been too surprising. ## Syntax Given that this feature has such a high syntax-to-substance ratio, we should expect more than the usual number of syntax opinions. Let's start with some consequences of our chosen design center. #### No fixed delimiter From the design choice above, it is a forced move to accept variable delimiters.? Otherwise, one cannot represent a string with the delimiter in a raw string, without inventing an escaping mechanism, and subverting our "raw means raw" goal. The "self-embedding test" is not a mere theoretical goal.? Since the snippets we expect to paste into Java source are not randomly chosen strings of characters, but meaningful snippets of some language, the likelihood of wanting to represent a string that contains the chosen delimiter goes up.? Even if you are willing to dismiss "embed Java in Java" as a serious use case (we're not), people also want a familiar delimiter, which means something that looks like the delimiter in other languages, further increasing the chance of collision.? (For example, if we'd picked a fixed triple quote delimiter, then you couldn't embed Groovy or Python code, among others -- surely a real use case).? Fixed delimiters (of any length) and "raw means raw" are not compatible goals, and we choose "raw means raw". The credible options for variable delimiters are using a repeating delimiter sequence (say, any number of ticks), or some sort of user-provided nonce ("here" docs), or both.? 
Nonces impose a higher cognitive load on readers, and their benefit accrues mostly to corner cases, so the more constrained option of repeating delimiters seems preferable.

#### Why not 'just' use triple quotes

People's syntax preferences are guided by familiarity, so we should expect suggestions to be biased towards what "similar" languages already do. So the suggestion of using """triple quotes""" should be expected.

We've already discussed how a fixed delimiter is not acceptable. So at a minimum, this would have to be adjusted to "three or more." While some people find triple quotes natural (or at least familiar), others find it offensively heavyweight. Neither crowd is going to convince the other.

#### But ticks are too light

The opposite of the "triple quotes are too heavy" argument is "ticks are too light"; that a single tick is a lightweight character, and could go unnoticed, especially if your monitor hasn't been cleaned for a while. Unfortunately the quote-like delimiters in the middle of the weight range are taken by other activities. Again, we can't satisfy the "too light" and "too heavy" crowds at the same time; whichever we do will make some people unhappy.

#### Why do you have to always do something new?

The quoting scheme chosen -- any number of ticks -- is actually taken from something we all use: Markdown (https://daringfireball.net/projects/markdown/syntax), which permits any number of ticks to be used for infix sequences, and any different number of ticks to be embedded. (Where we depart from Markdown is that Markdown strips any leading and trailing newlines from multi-line tick blocks, an appropriate trick for a page presentation language, but not consistent with the design goal of "raw".)

#### But I want indentation stripping

When embedding a snippet of one language in another, both of which support indentation, we are left with two choices: indent the enclosed block exactly, which has the effect of the code "jutting out to the left", or indent the enclosed block relative to the enclosing block, which has the effect of having more indentation than you might want for the enclosed block. Sometimes this doesn't matter, but sometimes it does. Whatever we do, one of these crowds will be unhappy. When in doubt, we stick to the principle of "raw means raw", and provide indentation stripping via new instance methods on `String` to allow a range of trimming options, such as `trimIndent()`.

#### But I want leading / trailing empty lines

Some people would like for the language to strip off leading and trailing blank lines. Like indentation stripping, this is going to be what people want sometimes, and sometimes not. And given that, again, we can't do both, we are again guided by "raw means raw", and provide library means to strip the extraneous newlines.

#### But I want a marker character to make it obvious

Some people would like a margin marker character, so they can manage margins like this:

    foo(`This is a long string
        >the characters up to, and
        >including, the bracket are stripped
        >by the compiler
        >    and this line is indented`)

(Others would argue the marker character should be "|".) Again, we believe these sorts of transforms are the purview of libraries, not language, and will be provided.

#### But people will make ASCII art

    ``````````````````
    `Yes, they might.`
    ``````````````````

#### But I want to use unicode escaping

There will be library support for explicitly processing Unicode escape sequences, or backslash escape sequences, or both.

#### But calling library methods like `longString`.trim() is ugly

You say ugly; I say simple and transparent.

#### But doing these things in libraries has to be slower and yield more bloated bytecode

No, it doesn't.

## Anomalies and puzzlers

While the proposed scheme is lexically very simple, it does have at least one surprising consequence, as well as at least one restriction:

 - The empty string cannot be represented by a raw string literal (because two consecutive ticks will be interpreted as a double-tick delimiter, not a starting and ending delimiter);
 - Strings containing line delimiters other than \n cannot be represented directly by a raw string literal.

The latter anomaly is true for any scheme that is free of embedding anomalies (escaping) and that normalizes newlines. If we chose not to normalize newlines, we'd arguably have a worse anomaly, which is that the carriage control of a raw string depends on the platform you compiled it on.

The empty-string anomaly is scary at first, but, in my opinion, is much less of a concern than the initial surprise makes it appear. Once you learn it, you won't forget it -- and IDEs and compilers will provide feedback that helps you learn it. It is also easily avoided: use traditional string literals unless you have a specific need for raw-ness. There already is a perfectly valid way to denote the empty string.

#### Can't these be fixed?

These anomalies can be moved around by tweaking the rules, but the result is going to be more complicated rules and the same number (or more) of anomalies, just in different places -- and sometimes in worse places. While there is room to subjectively differ on which anomalies are worse than others, we believe that the simplicity of this scheme, and its freedom from embedding anomalies, makes it the winner.

Because we start with such a simple rule (any number of consecutive ticks), pretty much any tweak is going to be complexity-increasing. It seems a poor tradeoff to make the feature more complex and less convenient for everyone, just to cater to empty strings.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gavin.bierman at oracle.com Wed Mar 28 13:15:33 2018
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Wed, 28 Mar 2018 14:15:33 +0100
Subject: Expression switch exception naming
Message-ID: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com>

Dear experts,

We're busy putting the finishing touches to the spec for expression switches. Here's one issue that came up that we'd like your opinion on.

Here is a part of the spec that deals with the dynamic semantics of expression switch:

 - If no pattern matches and there is no `default` pattern label, then a **??** is thrown and the entire expression `switch` completes abruptly for that reason.

So, the question is: What shall we call the **??** exception? There are two ways of looking at this:

1. The VM-centric view. Because the compiler checked that the expression switch was exhaustive, this can only have happened because someone has changed something after compilation. Right now that would be because we have added a new enum constant. In this case we could perhaps raise an "EnumConstantNotPresentException" exception.
But we are planning to introduce sealed types in a future version, and we could run into a similar error, where clearly the exception name can't refer to an enum. We need something that covers all possible causes. Accordingly, we could go with something like "IncompatibleClassChangeException". Whilst accurate, one might fear that the average Java programmer will not find this an informative exception name.

2. The language-centric view. We should name this exception after the feature that has caused it to be raised. So "MatchFailureException", "PatternFailureException", "UnexpectedMatchException" might be possibilities. A worry here might be that other future features might want to raise this exception, and the name will be less applicable. Perhaps "UnexpectedValueException"?

Your thoughts/suggestions much appreciated!
Gavin

From mark at io7m.com Wed Mar 28 13:39:47 2018
From: mark at io7m.com (Mark Raynsford)
Date: Wed, 28 Mar 2018 13:39:47 +0000
Subject: Expression switch exception naming
In-Reply-To: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com>
References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com>
Message-ID: <20180328133947.37a15e67@copperhead.int.arc7.info>

On 2018-03-28T14:15:33 +0100 Gavin Bierman wrote:
> A worry here might be that other future features might want to raise this exception, and the name will be less applicable. Perhaps "UnexpectedValueException"?

Is this really a worry? I mean, it's not really as if unchecked exception types are in short supply. I'd be in favour of adding more and more specific exceptions so that instances of them indicated *exactly* what went wrong (rather than "one of these several different language constructs went wrong in some way").

-- 
Mark Raynsford | http://www.io7m.com

From forax at univ-mlv.fr Wed Mar 28 15:19:45 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 28 Mar 2018 17:19:45 +0200 (CEST)
Subject: Expression switch exception naming
In-Reply-To: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com>
References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com>
Message-ID: <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr>

NoCaseMatchError (or any other name) which is a subtype of IncompatibleClassChangeError.

Rationale: the compiler already checks that the pattern is exhaustive or emits a compile error, so if there is an issue at runtime it's due to the separate compilation problem. So it will be a rare condition, thus it should be an error and not a runtime exception.

We already have different subtypes of IncompatibleClassChangeError depending on the exact problem, IllegalAccessError, AbstractMethodError etc; having a case missing should just be another subtype.

Rémi

----- Mail original -----
> De: "Gavin Bierman"
> À: "amber-spec-experts"
> Envoyé: Mercredi 28 Mars 2018 15:15:33
> Objet: Expression switch exception naming

> Dear experts,
>
> We're busy putting the finishing touches to the spec for expression switches.
> Here's one issue that came up that we'd like your opinion on.
>
> Here is a part of the spec that deals with the dynamic semantics of expression
> switch
>
> - If no pattern matches and there is no `default` pattern label, then
> a **??** is thrown and the entire expression `switch` completes abruptly for
> that reason.
>
> So, the question is: What shall we call the **??** exception? There are two ways
> of looking at this:
>
> 1. The VM-centric view.
Because the compiler checked that the expression switch > was exhaustive, this can only have happened because someone has changed > something after compilation. Right now that would be because we have added a > new enum constant. In this case we could perhaps raise an > "EnumConstantNotPresentException" exception. But we are planning to introduce > sealed types in a future version, and we could run into a similar error, where > clearly the exception name can't refer to an enum. We need something that > covers all possible causes. Accordingly, we could go with something like > "IncompatibleClassChangeException". Whilst accurate, one might fear that the > average Java programmer will not find this an informative exception name. > > 2. The language-centric view. We should name this exception after the feature > that has caused it to be raised. So "MatchFailureException", > "PatternFailureException", "UnexpectedMatchException" might be possibilities. A > worry here might be that other future features might want to raise this > exception, and the name will be less applicable. Perhaps > "UnexpectedValueException"? > > > Your thoughts/suggestions much appreciated! > Gavin From kevinb at google.com Wed Mar 28 15:24:55 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 28 Mar 2018 08:24:55 -0700 Subject: Expression switch exception naming In-Reply-To: <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> Message-ID: I'd almost hit send on the very same point... is it correct that this situation is described exactly by "Thrown when an incompatible class change has occurred to some class definition. The definition of some class, on which the currently executing method depends, has since changed."? If that shoe fits, we should wear it, and in naming our new subtype we should attempt consistency with: AbstractMethodError, IllegalAccessError, InstantiationError, NoSuchFieldError, NoSuchMethodError. On Wed, Mar 28, 2018 at 8:19 AM, Remi Forax wrote: > NoCaseMatchError (or any other name) which is a subtype of > IncompatibleClassChangeError. > > Rational: > the compiler already checks that the pattern is exhaustive or emit a > compile error, so it means that there is an issue at runtime it's due to > the separate compilation problem. > So it will be a rare condition, thus it should be an error and not a > runtime exception. > > We already have different subtypes of IncompatibleClassChangeError > depending on the exact problem, IllegalAccessError, AbstractMethodError > etc, having a case missing should just be another subtype. > > R?mi > > ----- Mail original ----- > > De: "Gavin Bierman" > > ?: "amber-spec-experts" > > Envoy?: Mercredi 28 Mars 2018 15:15:33 > > Objet: Expression switch exception naming > > > Dear experts, > > > > We're busy putting the finishing touches to the spec for expression > switches. > > Here's one issue that came up that we'd like your opinion on. > > > > Here is a part of the spec that deals with the dynamic semantics of > expression > > switch > > > > - If no pattern matches and there is no `default` pattern label, then > > a **??** is thrown and the entire expression `switch` completes > abruptly for > > that reason. > > > > > > So, the question is: What shall we call the **??** exception? There are > two ways > > of looking at this: > > > > 1. The VM-centric view. 
Because the compiler checked that the expression > switch > > was exhaustive, this can only have happened because someone has changed > > something after compilation. Right now that would be because we have > added a > > new enum constant. In this case we could perhaps raise an > > "EnumConstantNotPresentException" exception. But we are planning to > introduce > > sealed types in a future version, and we could run into a similar error, > where > > clearly the exception name can't refer to an enum. We need something that > > covers all possible causes. Accordingly, we could go with something like > > "IncompatibleClassChangeException". Whilst accurate, one might fear > that the > > average Java programmer will not find this an informative exception name. > > > > 2. The language-centric view. We should name this exception after the > feature > > that has caused it to be raised. So "MatchFailureException", > > "PatternFailureException", "UnexpectedMatchException" might be > possibilities. A > > worry here might be that other future features might want to raise this > > exception, and the name will be less applicable. Perhaps > > "UnexpectedValueException"? > > > > > > Your thoughts/suggestions much appreciated! > > Gavin > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Mar 28 15:51:43 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 28 Mar 2018 11:51:43 -0400 Subject: Expression switch exception naming In-Reply-To: <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> Message-ID: <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> > NoCaseMatchError (or any other name) which is a subtype of IncompatibleClassChangeError. I buy the "subtype of ICCE" argument, but it seems to me these need to be exceptions, not errors.? (Thought experiment: if we already had both ICC{Exception,Error}, would we have jumped so fast to Error?? I don't think so.)? I'd support adding ICEException and having these be subtypes. Adding a new enum value is not the same sort of obviously-incompatible change as changing a static method to instance, or a concrete method to abstract, which are the sorts of things that trigger ICCError. On the naming front, I would think this is more in the category of "unexpected class change exception" than "incompatible change." Adding a new enum constant isn't intrinsically evil.? If anything, the issue is on the client, who relied on the assumption of of exhaustiveness. From kevinb at google.com Wed Mar 28 18:06:51 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 28 Mar 2018 11:06:51 -0700 Subject: Expression switch exception naming In-Reply-To: <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> Message-ID: On Wed, Mar 28, 2018 at 8:51 AM, Brian Goetz wrote: > > Adding a new enum value is not the same sort of obviously-incompatible >> change as changing a static method to instance, or a concrete method to >> abstract, which are the sorts of things that trigger ICCError. ... Adding a >> new enum constant isn't intrinsically evil. If anything, the issue is on >> the client, who relied on the assumption of of exhaustiveness. 
> > Okay, this sentiment is what I'm disagreeing with. I think that what we are doing here is turning that change (add constant to enum) into an incompatible class change, just as much as any of the other kinds. It's directly analogous to adding an interface method. Clients were required to specify how to handle all the methods of that interface, but then one more showed up, and those clients are now broken. Users will find this counter-intuitive, but that's only because they won't be used to it yet. They'll have to learn. More to the point I think: the problem isn't "on" the client code; it's having jars in your runtime classpath that are newer than the jars you compiled against; that's always a dangerous idea and it continues to be so here. Today, an experienced developer knows that there is a category of Errors that, when you see them in the absence of reflection, always implicate this kind of classpath issue. I can't see why this would not belong in that same category. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Mar 28 18:29:25 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 28 Mar 2018 14:29:25 -0400 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> Message-ID: ICCE indicates that a _binary incompatible_ change was detected, which by definition the client cannot recover from.? Adding an enum constant is not a binary incompatibility (though removing one is).? (Interestingly, neither is changing an enum to a class, if you have static fields for all the constants -- I learned this recently.) However, adding an enum constant is a potential _behavioral_ incompatibility -- specifically, it could cause exceptions like this to be thrown under some circumstances.? And it is only potential because in order to have a problem, both the enum and the client must each bring a piece of the responsibility -- it is something the client could recover from it if it wished (by having a default).? This is an incompatibility, but also substantially less severe in a few ways than deleting a method.? (Maybe its like a method starting to return null where it never had before.) If a client has a default in their switch, there's no problem.? If a client doesn't have a default, but provides all the known items, the compiler builds in a default that throws, to prevent it from being silently ignored.? That's all good, and things are blowing up in the right place with an informative message.? The behavior incompatibility is an interaction between the enum and the client's use of that enum.? So I think the exception should point as much to the client as the enum. A concrete proposal: ??? UnexpectedClassChangeException <: RuntimeException ??? UnexpectedSwitchTarget <: UCCE? // works both for enum and sealed classes On 3/28/2018 2:06 PM, Kevin Bourrillion wrote: > On Wed, Mar 28, 2018 at 8:51 AM, Brian Goetz > wrote: > > > Adding a new enum value is not the same sort of > obviously-incompatible change as changing a static method to > instance, or a concrete method to abstract, which are the > sorts of things that trigger ICCError. ... Adding a new enum > constant isn't intrinsically evil.? If anything, the issue is > on the client, who relied on the assumption of of exhaustiveness. 
> > > Okay, this sentiment is what I'm disagreeing with. > > I think that what we are doing here is turning that change (add > constant to enum) into an?incompatible class change, just as much as > any of the other kinds. It's directly analogous to adding an interface > method. Clients were required to specify how to handle all the methods > of that interface, but then one more showed up, and those clients are > now broken. > > Users will find this counter-intuitive, but that's only because they > won't be used to it yet. They'll have to learn. > > More to the point I think: the problem isn't "on" the client code; > it's having jars in your runtime classpath that are newer than the > jars you compiled against; that's always a dangerous idea and it > continues to be so here. Today, an experienced developer knows that > there is a category of Errors that, when you see them in the absence > of reflection, always implicate this kind of classpath issue. I can't > see why this would not belong in that same category. > > -- > Kevin Bourrillion?|?Java Librarian |?Google, Inc.?|kevinb at google.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed Mar 28 18:48:24 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 28 Mar 2018 20:48:24 +0200 (CEST) Subject: Expression switch exception naming In-Reply-To: <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> Message-ID: <419693952.1414197.1522262904368.JavaMail.zimbra@u-pem.fr> Given that you can explicitly add a default target, it's an opt-in mechanism, if you add you own 'default' target, you explicitly say how you want to recover to an enum constant you do not known. Using a runtime exception is like another less readable way to do exactly that. R?mi ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" , "Gavin Bierman" > Cc: "amber-spec-experts" > Envoy?: Mercredi 28 Mars 2018 17:51:43 > Objet: Re: Expression switch exception naming >> NoCaseMatchError (or any other name) which is a subtype of >> IncompatibleClassChangeError. > > I buy the "subtype of ICCE" argument, but it seems to me these need to > be exceptions, not errors.? (Thought experiment: if we already had both > ICC{Exception,Error}, would we have jumped so fast to Error?? I don't > think so.)? I'd support adding ICEException and having these be subtypes. > > Adding a new enum value is not the same sort of obviously-incompatible > change as changing a static method to instance, or a concrete method to > abstract, which are the sorts of things that trigger ICCError. > > On the naming front, I would think this is more in the category of > "unexpected class change exception" than "incompatible change." Adding a > new enum constant isn't intrinsically evil.? If anything, the issue is > on the client, who relied on the assumption of of exhaustiveness. 
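To make the two positions concrete, here is a sketch (arrow-form syntax as currently proposed; the type thrown by the implicit default is exactly the name under discussion, so it is shown as a placeholder):

    enum TrafficLight { RED, YELLOW, GREEN }

    // Exhaustive over the constants known at compile time; the compiler
    // fills in a throwing default to cover constants added later.
    String describe(TrafficLight light) {
        return switch (light) {
            case RED    -> "stop";
            case YELLOW -> "slow";
            case GREEN  -> "go";
            // implicit: default -> throw new <NameUnderDiscussion>(...)
        };
    }

    // The opt-in: an explicit default chooses its own recovery (or its own throwable).
    String describeDefensively(TrafficLight light) {
        return switch (light) {
            case RED    -> "stop";
            case YELLOW -> "slow";
            case GREEN  -> "go";
            default     -> "unknown signal: " + light;
        };
    }
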
From brian.goetz at oracle.com Wed Mar 28 18:55:04 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 28 Mar 2018 14:55:04 -0400 Subject: Expression switch exception naming In-Reply-To: <419693952.1414197.1522262904368.JavaMail.zimbra@u-pem.fr> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <419693952.1414197.1522262904368.JavaMail.zimbra@u-pem.fr> Message-ID: <8777323b-206c-7d77-0c2b-c28940cf7c47@oracle.com> > Using a runtime exception is like another less readable way to do exactly that. > > It's both less and more readable. Its less readable in that its implicit in the code, rather than explicit.? But its probably the case that the exception thrown implicitly will be more informative to the user than the hand-written one, which is usually: ??? default: throw new AssertionError("can't get here"); Which is less readable for the poor fellow who had to debug it. From kevinb at google.com Wed Mar 28 19:35:15 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 28 Mar 2018 12:35:15 -0700 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> Message-ID: On Wed, Mar 28, 2018 at 11:29 AM, Brian Goetz wrote: ICCE indicates that a _binary incompatible_ change was detected, which by > definition the client cannot recover from. Adding an enum constant is not > a binary incompatibility (though removing one is). (Interestingly, neither > is changing an enum to a class, if you have static fields for all the > constants -- I learned this recently.) > > However, adding an enum constant is a potential _behavioral_ > incompatibility -- specifically, it could cause exceptions like this to be > thrown under some circumstances. And it is only potential because in order > to have a problem, both the enum and the client must each bring a piece of > the responsibility -- it is something the client could recover from it if > it wished (by having a default). This is an incompatibility, but also > substantially less severe in a few ways than deleting a method. (Maybe its > like a method starting to return null where it never had before.) > I have been figuring that if the client *has* a reasonable way to handle unknown values then it will probably go ahead and do that (with a `default`). Therefore I assumed that what we're talking about in this conversation is the* other* kind, where there is nothing safe they can do - for example if I wrote a method that displays a time interval as "10 ns" or "20 s", I may not find it acceptable for me to start displaying "30 " once I get handed TimeUnit.DAYS. My code is broken either way. If a constant is added, I need to react to that, just like I do with a new interface method. What does it really mean to say that this client "brings a piece of the responsibility" if it doesn't really have a choice? (also, I think we could probably provide evidence from our codebase that this is the far more common situation than the first (reasonable default) kind. Note that we have had compile-time enforcement of exhaustiveness for enum switches turned on for a while now, though we haven't done much to goad users into removing their defaults in order to activate it. Liam could say much more about how this is working if desired.) 
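A hypothetical sketch of that kind of client (the method and the set of abbreviations are made up for illustration):

    // import java.util.concurrent.TimeUnit;
    // Hypothetically compiled against a version of the enum that did not yet have DAYS.
    String abbreviate(long amount, TimeUnit unit) {
        String suffix = switch (unit) {
            case NANOSECONDS  -> "ns";
            case MICROSECONDS -> "us";
            case MILLISECONDS -> "ms";
            case SECONDS      -> "s";
            case MINUTES      -> "min";
            case HOURS        -> "h";
            // There is no sensible branch for DAYS here; printing an amount with no
            // unit is not acceptable, so failing at run time is the only honest option.
        };
        return amount + " " + suffix;
    }
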
So, I'm not quite yet following why the binary/source compatibility distinction, or the opt-in distinction, really makes all the difference here. > If a client has a default in their switch, there's no problem. If a > client doesn't have a default, but provides all the known items, the > compiler builds in a default that throws, to prevent it from being silently > ignored. That's all good, and things are blowing up in the right place > with an informative message. The behavior incompatibility is an > interaction between the enum and the client's use of that enum. So I think > the exception should point as much to the client as the enum. > > A concrete proposal: > > UnexpectedClassChangeException <: RuntimeException > UnexpectedSwitchTarget <: UCCE // works both for enum and sealed > classes > > > > > > > > On 3/28/2018 2:06 PM, Kevin Bourrillion wrote: > > On Wed, Mar 28, 2018 at 8:51 AM, Brian Goetz > wrote: > >> >> Adding a new enum value is not the same sort of obviously-incompatible >>> change as changing a static method to instance, or a concrete method to >>> abstract, which are the sorts of things that trigger ICCError. ... Adding a >>> new enum constant isn't intrinsically evil. If anything, the issue is on >>> the client, who relied on the assumption of of exhaustiveness. >> >> > Okay, this sentiment is what I'm disagreeing with. > > I think that what we are doing here is turning that change (add constant > to enum) into an incompatible class change, just as much as any of the > other kinds. It's directly analogous to adding an interface method. Clients > were required to specify how to handle all the methods of that interface, > but then one more showed up, and those clients are now broken. > > Users will find this counter-intuitive, but that's only because they won't > be used to it yet. They'll have to learn. > > More to the point I think: the problem isn't "on" the client code; it's > having jars in your runtime classpath that are newer than the jars you > compiled against; that's always a dangerous idea and it continues to be so > here. Today, an experienced developer knows that there is a category of > Errors that, when you see them in the absence of reflection, always > implicate this kind of classpath issue. I can't see why this would not > belong in that same category. > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.smith at oracle.com Wed Mar 28 19:37:51 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 28 Mar 2018 13:37:51 -0600 Subject: Feedback wanted: switch expression typing Message-ID: (Looking for some feedback on real-world code usage. Please read to the end, then if you can experiment with the code you work on and report back, I'd appreciate it!) Switch expressions, from a type checking perspective, are basically generalizations of conditional expressions: instead of 2 operands to check, we have n. A reasonable expectation is that, if I rewrite my conditional expression as a switch expression, it will behave the same: test ? foo() : bar() is equivalent to switch (test) { case true -> foo(); case false -> bar(); } So, as a starting point, the typing rules for switches should be the same as the typing rules for conditionals, but generalized to an arbitrary number of results. 
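For example, both result forms side by side (block bodies yield their result with the 'break value' form in the current draft syntax):

    int digits = switch (n) {
        case 0 -> 1;                  // result expression after '->'
        default -> {
            int count = 0;
            for (long v = Math.abs((long) n); v > 0; v /= 10) {
                count++;
            }
            break count;              // result expression after 'break'
        }
    };
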
(The "results" of a switch expression are all expressions appearing after a '->' or a 'break'.) Conditional expressions and switch expressions are typically used as poly expressions (in a context that has a target type). But that won't always be the case. One notable usage that doesn't have a target type is an initializer for 'var': "var x = ...". So they are sometimes poly expressions, sometimes standalone. Conditional expression typing is driven by an ad hoc categorization scheme which looks at the result expressions and tries to predict whether they will all have type boolean/Boolean, primitive/boxed number, or something else/a mix ("tries to predict" because in some cases we can't type-check the expression until we've completed the categorization). In the numeric case, we then identify the narrowest primitive type that can contain the results. In the other/mixed case, we then type check by pushing down a target type, or, if none is available, producing a reference type from the lub operation. A couple of observations: - The primitive vs. reference choice is meaningful, because the primitive and reference type hierarchies are different (e.g., int can be widened to long, but Integer can't be widened to Long). Preferring primitive typing where possible seems like the right choice. - The ad hoc categorization is a bit of a mess. It's complex and imperfect. What people probably expect is that, where a target type is available, that's what the compiler will use?but the compiler ignores the target type in the primitive cases. Why? Well, in 8, when we introduced target typing of conditionals, we identified some incompatibilities that would occur if we changed the handing of primitives, and we didn't want to be disruptive. Some examples: Boolean x = test ? z : zbox; // specified: can NPE; target typing: no null check Integer x = test ? s : i; // specified: ok; target typing: can't convert short->Integer Number x = test ? s : i; // specified: box to Integer; target typing: box to Short or Integer double d = test ? l : f; // specified: long->float loses precision; target typing: long->double better precision m(test ? z : zbox); // specified: prefers m(boolean); target typing: m(boolean) and m(Boolean) are ambiguous At this point, we've got a choice: A) Fully mimic the conditional behavior in switch expressions B) Do target typing (when available) for all switch expressions, diverging from conditionals C) Do target typing (when available) for all switches and conditionals, accepting the incompatibilities (A) sacrifices simplicity. (B) sacrifices consistency. (C) sacrifices compatibility. General thoughts on simplicity (is the current behavior hard to understand?) and consistency (is it bad if the conditional/switch refactoring leads to subtly different typing?) are welcome. And we could use some clarification is just how significant the compatibility costs of (C) are. With that in mind, here's a javac patch: http://cr.openjdk.java.net/~dlsmith/logPrimitiveConditionals.patch A javac built with this patch supports an option that will output diagnostics wherever conditionals at risk of incompatible change are detected: javac -XDlogPrimitiveConditionals Foo.java If you're able to build OpenJDK with this patch and run it on some real-world code, I'd appreciate any insights about what you find. 
—Dan

From forax at univ-mlv.fr Wed Mar 28 19:44:20 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 28 Mar 2018 21:44:20 +0200 (CEST)
Subject: Expression switch exception naming
In-Reply-To: <8777323b-206c-7d77-0c2b-c28940cf7c47@oracle.com>
References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <419693952.1414197.1522262904368.JavaMail.zimbra@u-pem.fr> <8777323b-206c-7d77-0c2b-c28940cf7c47@oracle.com>
Message-ID: <1093130344.1420134.1522266260801.JavaMail.zimbra@u-pem.fr>

My point was more that if you want an exception and not an error, you can add a default, rather than discussing the merits of implicit vs explicit throwables. The compiler should emit code that throws an error for an unknown enum constant. If the compiler emits code that throws an exception, people will also be able to catch it, adding another way to react to a missing enum constant.

regards,
Rémi

----- Mail original -----
> De: "Brian Goetz"
> À: "Remi Forax"
> Cc: "Gavin Bierman" , "amber-spec-experts"
> Envoyé: Mercredi 28 Mars 2018 20:55:04
> Objet: Re: Expression switch exception naming

>> Using a runtime exception is like another less readable way to do exactly that.
>
> It's both less and more readable.
>
> Its less readable in that its implicit in the code, rather than
> explicit. But its probably the case that the exception thrown
> implicitly will be more informative to the user than the hand-written
> one, which is usually:
>
>     default: throw new AssertionError("can't get here");
>
> Which is less readable for the poor fellow who had to debug it.

From brian.goetz at oracle.com Wed Mar 28 19:48:58 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Mar 2018 15:48:58 -0400
Subject: Expression switch exception naming
In-Reply-To: 
References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com>
Message-ID: <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com>

> I have been figuring that if the client /has/ a reasonable way to
> handle unknown values then it will probably go ahead and do that (with
> a `default`).

I think that's a fair assumption for your codebase, but not in general. Developers will surely do this:

    x = switch (trafficLight) {
        case RED -> ...
        case YELLOW -> ...
        case GREEN -> ...
    }

and leave out a default because they can. So they get a default default, one that throws. No problem.

The only question here is: what to throw. My argument is that Error is just too strong an indicator. (It's like using fatal as your logging level for everything; it would be more useful to use warning for things that aren't fatal). From the Error doc:

    An Error is a subclass of Throwable that indicates serious problems
    that a reasonable application should not try to catch. Most such
    errors are abnormal conditions.

Serious problems mean that underlying VM mechanisms have failed. Encountering an unexpected input is not in this category. Sure, it deserves an exception, but it's not an ICCE.

> Therefore I assumed that what we're talking about in this conversation
> is the /other/ kind, where there is nothing safe they can do - for
> example if I wrote a method that displays a time interval as "10 ns"
> or "20 s", I may not find it acceptable for me to start displaying "30
> " once I get handed TimeUnit.DAYS. My code is broken either way.
If a constant is added, I need to react to that, just like > I do with a new interface method. What does it really mean to say that > this client "brings a piece of the responsibility" if it doesn't > really have a choice? It's not unlike this: ??? AnEnum e = f(...); ??? switch (e) { ??????? ... ??? } and not being prepared for a null.? You'll get an NPE.? The local code isn't expected to deal with it, but somewhere up the stack, someone is prepared to deal with it, discard the offending incoming work item, log what happened, and re-enter the work loop. > So, I'm not quite yet following why the binary/source compatibility > distinction, or the opt-in distinction, really makes all the > difference here. Some incompatibilities are more of a fire drill than others.? Binary incompatibilities (e.g., removing a method) are harder to recover from than unexpected inputs.? Further, while there may be no good _local_ recover for an unexpected input, there often is a reasonable global recovery.? Error means "fire drill".? I claim this doesn't rise to the level of Error; it's more like NumberFormatException or NPE or ClassCastException. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Mar 28 20:24:16 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 28 Mar 2018 16:24:16 -0400 Subject: Raw string literals -- where we are, how we got here In-Reply-To: References: Message-ID: My apologies, I left out one additional known limitation: raw string literals that start or end with tick.? There are numerous workarounds, none of which are beautiful, such as: ??? String s = `` `foo` ``.trim(); ??? String s = TICK + `foo` + TICK; > ## Anomalies and puzzlers > > While the proposed scheme is lexically very simple, it does have some > at least one surprising consequence, as well as at least one restriction: > ?- The empty string cannot be represented by a raw string literal > (because two consecutive ticks will be interpreted as a double-tick > delimiter, not a starting and ending delimiter); > ?- String containing line delimiters other than \n cannot be > represented directly by a raw string literal. > > The latter anomaly is true for any scheme that is free of embedding > anomalies (escaping) and that normalizes newlines.? If we chose to not > normalize newlines, we'd arguably have a worse anomaly, which is that > the carriage control of a raw string depends on the platform you > compiled it on. > > The empty-string anomaly is scary at first, but, in my opinion, is > much less of a concern than the initial surprise makes it appear.? > Once you learn it, you won't forget it -- and IDEs and compilers will > provide feedback that help you learn it.? It is also easily avoided: > use traditional string literals unless you have a specific need for > raw-ness.? There already is a perfectly valid way to denote the empty > string. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cushon at google.com Wed Mar 28 22:17:10 2018 From: cushon at google.com (Liam Miller-Cushon) Date: Wed, 28 Mar 2018 22:17:10 +0000 Subject: Feedback wanted: switch expression typing In-Reply-To: References: Message-ID: Hi Dan, On Wed, Mar 28, 2018 at 12:38 PM Dan Smith wrote: > If you're able to build OpenJDK with this patch and run it on some > real-world code, I'd appreciate any insights about what you find. 
> I recorded the number of times each diagnostic was produced as a fraction of all conditional expressions in Google's codebase: compiler.note.primitive.conditional.incompatible - 1 in 1400 compiler.note.primitive.conditional.box - 1 in 2400 compiler.note.primitive.conditional.null - 1 in 700 compiler.note.primitive.conditional.precision - 1 in 21000 compiler.note.primitive.conditional.overload - 1 in 6800 We've found that some fraction of the null / boxing cases are actually mistakes where the programmer expected the behaviour they would have gotten from target typing. We've been steering people away from relying on that behaviour with static analysis, so my numbers may under-count those diagnostics relative to other codebases. -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.smith at oracle.com Wed Mar 28 22:55:00 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 28 Mar 2018 16:55:00 -0600 Subject: Feedback wanted: switch expression typing In-Reply-To: References: Message-ID: > On Mar 28, 2018, at 4:17 PM, Liam Miller-Cushon wrote: > > Hi Dan, > > On Wed, Mar 28, 2018 at 12:38 PM Dan Smith > wrote: > If you're able to build OpenJDK with this patch and run it on some real-world code, I'd appreciate any insights about what you find. > > I recorded the number of times each diagnostic was produced as a fraction of all conditional expressions in Google's codebase: > > compiler.note.primitive.conditional.incompatible - 1 in 1400 > compiler.note.primitive.conditional.box - 1 in 2400 > compiler.note.primitive.conditional.null - 1 in 700 > compiler.note.primitive.conditional.precision - 1 in 21000 > compiler.note.primitive.conditional.overload - 1 in 6800 > > We've found that some fraction of the null / boxing cases are actually mistakes where the programmer expected the > behaviour they would have gotten from target typing. We've been steering people away from relying on that behaviour > with static analysis, so my numbers may under-count those diagnostics relative to other codebases. Thanks! This is really useful. Subjectively, big picture: how concerned would you be about changing typing rules in these cases? Some followup questions, if you're able to dig into the specific cases and offer a sense of what they look like (if it helps, I could probably also improve the automated detection, with some feedback from you about what you're seeing): 1) The incompatibilities are maybe the biggest concern. And it's not clear that it's helpful for the compiler to reject these sorts of conversions, so maybe we should change the rules. In particular, this is silly: Short s = 0; // fine Long l = 0; // error So: what portion of "primitive.conditional.incompatible" are something other than a literal? Other than a constant expression? 2) Often, the choice of box class doesn't matter (e.g., if printing a Byte/Short/Integer/Long as a string). What portion of "primitive.conditional.box" seem to care about the which box class is chosen? 3) A common pattern for null checking is: Integer ibox2 = (ibox == null) ? 0 : ibox; I'm guessing many of your "primitive.conditional.null" cases look like that. And if not, they're likely to guarantee in the surrounding context that no nulls are present. What portion of these actually seem to need and expect a null pointer check? 4) The overload resolution test casts a somewhat wide net, because actually simulating overload resolution is complicated. 
So the test is looking for cases in which there are other candidates that would be considered. What portion of these invocations actually appear that they would prompt a different overload choice or an ambiguity? And if the resolved method changes, how often is it a behaviorally significant change (often different overloads have the same behavior)? -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.smith at oracle.com Wed Mar 28 23:13:22 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 28 Mar 2018 17:13:22 -0600 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> Message-ID: > On Mar 23, 2018, at 1:41 PM, Brian Goetz wrote: > > I prefer the simplicity that "all expressions either complete normally or complete abruptly with cause exception." Just want to emphasize that this is a really important property of the language, and of what we mean when we call some things "statements" and other things "expressions". A good overview here: https://docs.oracle.com/javase/specs/jls/se10/html/jls-14.html#jls-14.1 Of course, we can change these definitions. But introducing expressions that can complete abruptly for control flow reasons is a significantly more disruptive change than your typical "add a new kind of expression" feature. ?Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Wed Mar 28 23:38:30 2018 From: john.r.rose at oracle.com (John Rose) Date: Wed, 28 Mar 2018 16:38:30 -0700 Subject: Disallowing break label (and continue label) inside an expression switch In-Reply-To: References: <431744712.1834159.1520001001216.JavaMail.zimbra@u-pem.fr> <7ba8efde-6641-2d0d-99d5-138caf373721@oracle.com> <3D0957FB-F6AB-4127-A39B-930810788758@oracle.com> <436dfa71-8869-1f69-f75f-bfc2ed3cfdff@oracle.com> Message-ID: <9F3B3460-ECDE-4DD2-A1FD-24805959A75C@oracle.com> On Mar 28, 2018, at 4:13 PM, Dan Smith wrote: > >> On Mar 23, 2018, at 1:41 PM, Brian Goetz > wrote: >> >> I prefer the simplicity that "all expressions either complete normally or complete abruptly with cause exception." > > Just want to emphasize that this is a really important property of the language, and of what we mean when we call some things "statements" and other things "expressions". > > A good overview here: > > https://docs.oracle.com/javase/specs/jls/se10/html/jls-14.html#jls-14.1 > > Of course, we can change these definitions. But introducing expressions that can complete abruptly for control flow reasons is a significantly more disruptive change than your typical "add a new kind of expression" feature. +100 -------------- next part -------------- An HTML attachment was scrubbed... 
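In code, the kind of nonlocal jump that property rules out of expressions looks like this (stand-in syntax; 'args' is assumed to be a String[] in scope; both marked cases would be rejected):

    int total = 0;
    OUTER:
    for (String s : args) {
        int weight = switch (s) {
            case "light" -> 1;
            case "heavy" -> 10;
            case "skip"  -> { continue; }     // rejected: completes abruptly without a value
            default      -> { break OUTER; }  // rejected: breaks out through the expression
        };
        total += weight;
    }
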
URL: From daniel.smith at oracle.com Wed Mar 28 23:47:50 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 28 Mar 2018 17:47:50 -0600 Subject: Records -- current status In-Reply-To: References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> Message-ID: <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> > On Mar 20, 2018, at 8:15 AM, Brian Goetz wrote: > >> >> So add all this up and we have three kind of finalness for fields: >> >> - by default mutable, but you can change it >> - by default final, and you can't change it >> - (and now) by default final, but you can change it >> >> This seems like quite a bad situation to me. >> > > I think what you are really saying here is: if you want immutable records, wait for value records, don't try to cram them in early? Then a record inherits the finality of the class kind that it is describing. And same with field accessibility. Value records don't support recursion, so are useless for many applications. The sweet spot for records is immutable fields of any type. If the way to express that is to repeat "final" a bunch of times in the declaration, we will have failed. It's a fair point that we are comfortable with "implicitly always final", but "final by default" is a new thing. And if there's a way to describe record-like things that have mutable fields without a 'non-final' keyword, great. But I think we need to spell those things using something other than "record Foo(int x, int y)". ?Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: From amaembo at gmail.com Thu Mar 29 06:37:26 2018 From: amaembo at gmail.com (Tagir Valeev) Date: Thu, 29 Mar 2018 13:37:26 +0700 Subject: Feedback wanted: switch expression typing In-Reply-To: References: Message-ID: Hello! Existing ?: behavior is certainly surprising and continuous source of puzzlers. E.g. the following was used in our latest Java Puzzlers NG S03 talk: boolean x = false; System.out.println(x ? 42 : null); // prints null System.out.println(x ? 42 : x ? 42 : null); // NPE It's really hard to guess what will happen and even if you know, it's usually hard to explain. Such problems actually cause bugs in real code, I encountered some of them in my practice. Nevertheless I'm for consistent behavior (A or C). This would simplify the understanding of both constructs and also simplify the spec as ?: chapter may just explain the operator as syntactic sugar over switch(condition) { case true -> e1; case false -> e2; } referring to the switch expression chapter for the complete semantics explanation. With best regards, Tagir Valeev. On Thu, Mar 29, 2018 at 2:37 AM, Dan Smith wrote: > (Looking for some feedback on real-world code usage. Please read to the > end, then if you can experiment with the code you work on and report back, > I'd appreciate it!) > > Switch expressions, from a type checking perspective, are basically > generalizations of conditional expressions: instead of 2 operands to check, > we have n. > > A reasonable expectation is that, if I rewrite my conditional expression > as a switch expression, it will behave the same: > > test ? foo() : bar() > is equivalent to > switch (test) { case true -> foo(); case false -> bar(); } > > So, as a starting point, the typing rules for switches should be the same > as the typing rules for conditionals, but generalized to an arbitrary > number of results. > > (The "results" of a switch expression are all expressions appearing after > a '->' or a 'break'.) 
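For scale, the sweet spot being described is the one-liner, with component finality implicit (record syntax as sketched in this thread):

    // Implicitly final components; constructor, equals, hashCode and toString filled in:
    record Point(int x, int y) { }
    record Range(Point low, Point high) { }

    // versus what "repeat final a bunch of times" would look like:
    record VerbosePoint(final int x, final int y) { }
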
> > Conditional expressions and switch expressions are typically used as poly > expressions (in a context that has a target type). But that won't always be > the case. One notable usage that doesn't have a target type is an > initializer for 'var': "var x = ...". So they are sometimes poly > expressions, sometimes standalone. > > Conditional expression typing is driven by an ad hoc categorization scheme > which looks at the result expressions and tries to predict whether they > will all have type boolean/Boolean, primitive/boxed number, or something > else/a mix ("tries to predict" because in some cases we can't type-check > the expression until we've completed the categorization). > > In the numeric case, we then identify the narrowest primitive type that > can contain the results. > > In the other/mixed case, we then type check by pushing down a target type, > or, if none is available, producing a reference type from the lub operation. > > A couple of observations: > > - The primitive vs. reference choice is meaningful, because the primitive > and reference type hierarchies are different (e.g., int can be widened to > long, but Integer can't be widened to Long). Preferring primitive typing > where possible seems like the right choice. > > - The ad hoc categorization is a bit of a mess. It's complex and > imperfect. What people probably expect is that, where a target type is > available, that's what the compiler will use?but the compiler ignores the > target type in the primitive cases. > > Why? Well, in 8, when we introduced target typing of conditionals, we > identified some incompatibilities that would occur if we changed the > handing of primitives, and we didn't want to be disruptive. > > Some examples: > Boolean x = test ? z : zbox; // specified: can NPE; target typing: no null > check > Integer x = test ? s : i; // specified: ok; target typing: can't convert > short->Integer > Number x = test ? s : i; // specified: box to Integer; target typing: box > to Short or Integer > double d = test ? l : f; // specified: long->float loses precision; target > typing: long->double better precision > m(test ? z : zbox); // specified: prefers m(boolean); target typing: > m(boolean) and m(Boolean) are ambiguous > > At this point, we've got a choice: > A) Fully mimic the conditional behavior in switch expressions > B) Do target typing (when available) for all switch expressions, diverging > from conditionals > C) Do target typing (when available) for all switches and conditionals, > accepting the incompatibilities > > (A) sacrifices simplicity. (B) sacrifices consistency. (C) sacrifices > compatibility. > > General thoughts on simplicity (is the current behavior hard to > understand?) and consistency (is it bad if the conditional/switch > refactoring leads to subtly different typing?) are welcome. > > And we could use some clarification is just how significant the > compatibility costs of (C) are. With that in mind, here's a javac patch: > > http://cr.openjdk.java.net/~dlsmith/logPrimitiveConditionals.patch > > A javac built with this patch supports an option that will output > diagnostics wherever conditionals at risk of incompatible change are > detected: > > javac -XDlogPrimitiveConditionals Foo.java > > If you're able to build OpenJDK with this patch and run it on some > real-world code, I'd appreciate any insights about what you find. > > ?Dan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniel.smith at oracle.com Thu Mar 29 18:29:26 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Thu, 29 Mar 2018 12:29:26 -0600 Subject: Records -- current status In-Reply-To: <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> Message-ID: <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> This reads harsher than I intended, due to me efforts to be brief. So let me elaborate a little: > On Mar 28, 2018, at 5:47 PM, Dan Smith wrote: > >> On Mar 20, 2018, at 8:15 AM, Brian Goetz > wrote: >> >>> >>> So add all this up and we have three kind of finalness for fields: >>> >>> - by default mutable, but you can change it >>> - by default final, and you can't change it >>> - (and now) by default final, but you can change it >>> >>> This seems like quite a bad situation to me. >>> >> >> I think what you are really saying here is: if you want immutable records, wait for value records, don't try to cram them in early? Then a record inherits the finality of the class kind that it is describing. And same with field accessibility. > > Value records don't support recursion, so are useless for many applications. > > The sweet spot for records is immutable fields of any type. If the way to express that is to repeat "final" a bunch of times in the declaration, we will have failed. To define "failed" more clearly: sure people will still use Java, they'll happily use the feature, things will be fine. But we'll be asking every record user to pay a tax (a handful of of "final" keywords) to accommodate the few weirdos who want mutable records. Underlying this is my expectation that most record declarations will be short (many one-liners), and most will not need mutable fields. The "non-final" keyword is a reasonable way to solve this problem without asking everyone to pay a tax. But Kevin's critique is also reasonable. > It's a fair point that we are comfortable with "implicitly always final", but "final by default" is a new thing. And if there's a way to describe record-like things that have mutable fields without a 'non-final' keyword, great. But I think we need to spell those things using something other than "record Foo(int x, int y)". What else could we do? Don't take these random ideas too seriously, but: maybe the declaration is a "mutable record"? Or just a "class", with some other signal that many record-like features are relevant? Or maybe the mutable fields appear in a different context? I feel like we could probably come up with something reasonable if we felt that final by default with a "non-final" opt-in is too confusing. ?Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Thu Mar 29 18:39:09 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 29 Mar 2018 14:39:09 -0400 Subject: Records -- current status In-Reply-To: <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> Message-ID: <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> > What else could we do? Don't take these random ideas too seriously, > but: maybe the declaration is a "mutable record"? Or just a "class", > with some other signal that many record-like features are relevant? Or > maybe the mutable fields appear in a different context? 
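A sketch of the opt-in being weighed here ("non-final" is only a strawman spelling):

    // Components are final by default:
    record Point(int x, int y) { }

    // The rare mutable case opts in explicitly, instead of taxing every declaration:
    record Counter(String name, non-final long count) { }
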
> > I feel like we could probably come up with something reasonable if we > felt that final by default with a "non-final" opt-in is too confusing. I'm OK with finding other ways to do this than "non-final", though I think its quite likely that the "non-*" convention will muscle its way in at some point anyway (to name one example, classes that would be sealed by default will need a way to say "not sealed"), so I don't want to put too much stock in keyword-sticker-shock-avoidance.? (I actually think non-final is a pretty good answer here; no one will be confused the first time they see it (they'll just bikeshed that it should have been spelled ?table" or something like that.)) I'm less OK with saying "let's do immutable records now, and then figure out the mutability story." While some of the goodies for records will eventually filter down in some form to classes (e.g., better ways to fill in the obvious defaults in constructors, better ways to declare equals/hashCode), I also don't really want to count on that; I'd like to do a complete record feature and then select the bits we want to transplant to classes. I guess the question that this particular sub-thread is looking for an answer to is, which we dislike less: having to say final a lot, or having a new and different default for mutability of record fields.? (Or something else.) From daniel.smith at oracle.com Thu Mar 29 18:43:37 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Thu, 29 Mar 2018 12:43:37 -0600 Subject: Records -- current status In-Reply-To: <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> Message-ID: <946A0078-53D7-407A-AC38-2A16C1743341@oracle.com> > On Mar 29, 2018, at 12:39 PM, Brian Goetz wrote: > > I'm less OK with saying "let's do immutable records now, and then figure out the mutability story." Yes, agreed. We should have a mutability story out of the gate, and I buy that "sorry, you're out of luck" is not the right story. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Thu Mar 29 19:25:56 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 29 Mar 2018 15:25:56 -0400 Subject: Records -- current status In-Reply-To: <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> Message-ID: I've always assumed that, eventually, if we wanted too, we could always spell it as !final That also makes possible such explicit declarations as !volatile and !transient . . . And possibly even !null. !final !null String x = "quux"; // mutable, but you can't set it to null ?Guy Sent from my iPhone > On Mar 29, 2018, at 2:39 PM, Brian Goetz wrote: > > >> What else could we do? Don't take these random ideas too seriously, but: maybe the declaration is a "mutable record"? Or just a "class", with some other signal that many record-like features are relevant? Or maybe the mutable fields appear in a different context? >> >> I feel like we could probably come up with something reasonable if we felt that final by default with a "non-final" opt-in is too confusing. 
> > I'm OK with finding other ways to do this than "non-final", though I think it's quite likely that the "non-*" convention will muscle its way in at some point anyway (to name one example, classes that would be sealed by default will need a way to say "not sealed"), so I don't want to put too much stock in keyword-sticker-shock-avoidance. (I actually think non-final is a pretty good answer here; no one will be confused the first time they see it (they'll just bikeshed that it should have been spelled "mutable" or something like that.)) > > I'm less OK with saying "let's do immutable records now, and then figure out the mutability story." > > While some of the goodies for records will eventually filter down in some form to classes (e.g., better ways to fill in the obvious defaults in constructors, better ways to declare equals/hashCode), I also don't really want to count on that; I'd like to do a complete record feature and then select the bits we want to transplant to classes. > > I guess the question that this particular sub-thread is looking for an answer to is, which we dislike less: having to say final a lot, or having a new and different default for mutability of record fields. (Or something else.) > From brian.goetz at oracle.com Thu Mar 29 19:27:56 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 29 Mar 2018 15:27:56 -0400 Subject: Records -- current status In-Reply-To: References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> Message-ID: And then, when we have non-nullable types, you can write: void foo(!final String! s) and users will think it's a new form of brackets. On 3/29/2018 3:25 PM, Guy Steele wrote: > I've always assumed that, eventually, if we wanted to, we could always spell it as > > !final > > That also makes possible such explicit declarations as !volatile and !transient . . . And possibly even !null. > > !final !null String x = "quux"; // mutable, but you can't set it to null > > --Guy > > Sent from my iPhone > >> On Mar 29, 2018, at 2:39 PM, Brian Goetz wrote: >> >> >>> What else could we do? Don't take these random ideas too seriously, but: maybe the declaration is a "mutable record"? Or just a "class", with some other signal that many record-like features are relevant? Or maybe the mutable fields appear in a different context? >>> >>> I feel like we could probably come up with something reasonable if we felt that final by default with a "non-final" opt-in is too confusing. >> I'm OK with finding other ways to do this than "non-final", though I think it's quite likely that the "non-*" convention will muscle its way in at some point anyway (to name one example, classes that would be sealed by default will need a way to say "not sealed"), so I don't want to put too much stock in keyword-sticker-shock-avoidance. (I actually think non-final is a pretty good answer here; no one will be confused the first time they see it (they'll just bikeshed that it should have been spelled "mutable" or something like that.)) >> >> I'm less OK with saying "let's do immutable records now, and then figure out the mutability story."
>> >> While some of the goodies for records will eventually filter down in some form to classes (e.g., better ways to fill in the obvious defaults in constructors, better ways to declare equals/hashCode), I also don't really want to count on that; I'd like to do a complete record feature and then select the bits we want to transplant to classes. >> >> I guess the question that this particular sub-thread is looking for an answer to is, which we dislike less: having to say final a lot, or having a new and different default for mutability of record fields. (Or something else.) >> From forax at univ-mlv.fr Thu Mar 29 20:00:13 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 29 Mar 2018 22:00:13 +0200 (CEST) Subject: Records -- current status In-Reply-To: References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> Message-ID: <561682784.1852687.1522353613047.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Guy Steele" > ?: "Brian Goetz" > Cc: "amber-spec-experts" > Envoy?: Jeudi 29 Mars 2018 21:25:56 > Objet: Re: Records -- current status > I've always assumed that, eventually, if we wanted too, we could always spell it > as > > !final > > That also makes possible such explicit declarations as !volatile and !transient > . . . And possibly even !null. > > !final !null String x = "quux"; // mutable, but you can't set it to null > > ?Guy !final is a property of the variable while !null is more a property of the type. I see no problem to have a keyword 'mut' like in Rust instead of !final. But before i think we need to have a way to reverse the defaults or at least spell them by example at module declaration. default field final, parameter final, variable mut; default ref nonnull; R?mi > > Sent from my iPhone > >> On Mar 29, 2018, at 2:39 PM, Brian Goetz wrote: >> >> >>> What else could we do? Don't take these random ideas too seriously, but: maybe >>> the declaration is a "mutable record"? Or just a "class", with some other >>> signal that many record-like features are relevant? Or maybe the mutable fields >>> appear in a different context? >>> >>> I feel like we could probably come up with something reasonable if we felt that >>> final by default with a "non-final" opt-in is too confusing. >> >> I'm OK with finding other ways to do this than "non-final", though I think its >> quite likely that the "non-*" convention will muscle its way in at some point >> anyway (to name one example, classes that would be sealed by default will need >> a way to say "not sealed"), so I don't want to put too much stock in >> keyword-sticker-shock-avoidance. (I actually think non-final is a pretty good >> answer here; no one will be confused the first time they see it (they'll just >> bikeshed that it should have been spelled ?table" or something like that.)) >> >> I'm less OK with saying "let's do immutable records now, and then figure out the >> mutability story." >> >> While some of the goodies for records will eventually filter down in some form >> to classes (e.g., better ways to fill in the obvious defaults in constructors, >> better ways to declare equals/hashCode), I also don't really want to count on >> that; I'd like to do a complete record feature and then select the bits we want >> to transplant to classes. 
>> >> I guess the question that this particular sub-thread is looking for an answer to >> is, which we dislike less: having to say final a lot, or having a new and >> different default for mutability of record fields. (Or something else.) From kevinb at google.com Thu Mar 29 20:59:14 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Thu, 29 Mar 2018 13:59:14 -0700 Subject: Records -- current status In-Reply-To: <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> Message-ID: I hope we would be very reluctant to start introducing keywords that contain punctuation ("non-final"). it's never been done and would likely confuse any number of tools for a while. I somewhat like (gut-level) the idea of a single modifier on the record itself that reverses the default for all the fields at once... it emphasizes that the entire thing is becoming a mutable record, even if you put final back onto some of the fields. On Thu, Mar 29, 2018 at 11:39 AM, Brian Goetz wrote: > > What else could we do? Don't take these random ideas too seriously, but: >> maybe the declaration is a "mutable record"? Or just a "class", with some >> other signal that many record-like features are relevant? Or maybe the >> mutable fields appear in a different context? >> >> I feel like we could probably come up with something reasonable if we >> felt that final by default with a "non-final" opt-in is too confusing. >> > > I'm OK with finding other ways to do this than "non-final", though I think > its quite likely that the "non-*" convention will muscle its way in at some > point anyway (to name one example, classes that would be sealed by default > will need a way to say "not sealed"), so I don't want to put too much stock > in keyword-sticker-shock-avoidance. (I actually think non-final is a > pretty good answer here; no one will be confused the first time they see it > (they'll just bikeshed that it should have been spelled ?table" or > something like that.)) > > I'm less OK with saying "let's do immutable records now, and then figure > out the mutability story." > > While some of the goodies for records will eventually filter down in some > form to classes (e.g., better ways to fill in the obvious defaults in > constructors, better ways to declare equals/hashCode), I also don't really > want to count on that; I'd like to do a complete record feature and then > select the bits we want to transplant to classes. > > I guess the question that this particular sub-thread is looking for an > answer to is, which we dislike less: having to say final a lot, or having a > new and different default for mutability of record fields. (Or something > else.) > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
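To make the shapes under discussion concrete, here is a sketch of the two opt-out styles mentioned so far; all of the record syntax below is hypothetical and purely illustrative, nothing here is proposed text:

    // Per-field opt-out ("non-final" as discussed above, or Guy's "!final"):
    record Point(int x, non-final int y) { }              // hypothetical: y mutable, x stays final

    // A single modifier on the record itself that flips the default for all
    // fields at once (Kevin's suggestion); individual fields could then opt
    // back in to finality:
    mutable record Span(int start, final int length) { }  // hypothetical syntax

The difference the thread is weighing is how much of the declaration a reader has to scan before knowing which fields can change.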
URL: From brian.goetz at oracle.com Thu Mar 29 21:15:06 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 29 Mar 2018 17:15:06 -0400 Subject: Records -- current status In-Reply-To: References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> Message-ID: <5000593b-2eab-b812-a2a2-14d0e0927f33@oracle.com> The idea of factoring out the defaults somewhere close by isn't intrinsically objectionable (though the suggestion to factor them all the way into the module descriptor is horrible), but it makes me unhappy for a different reason -- it creates the impression that there are two kinds of records, mutable and regular, with separate semantics.? That creates a proliferation of new concepts in the users mind, and worse, in ours -- which makes it more likely the semantics of mutable vs plain records will diverge eventually.? A modifier on the field, on the other hand, is something the user already understands, especially when it is something as self-explanatory as `mutable` or `nonfinal` or `non-final`. I would like it to be clear that there is one kind of record. (Ideally it deals well enough with both final and nonfinal fields, perhaps favoring one over the other.) On the topic of how to spell "non-final", let's keep these in mind: ?- There *will* be other negation keywords coming.? So a regularized way to express it makes future decisions easier and reduces the perceived cost of new keywords that are just the negation of old keywords. ?- It may be tempting to spell it "mutable", but that only describes one meaning of final, and wouldn't do well for the others (final classes and final methods.) (FWIW, non-final is considerably *easier* for the parser to handle than "nonfinal" -- because "-" is already a token and "final" is already a keyword.? Depending on where in the grammar a new contextual keyword is allowed, one may have to jump through unpleasant hoops.? But non-final poses relatively little problem, at least for our compiler.? This is a strong point in favor of the non-keyword scheme.) On 3/29/2018 4:59 PM, Kevin Bourrillion wrote: > I somewhat like (gut-level) the idea of a single modifier on the > record itself that reverses the default for all the fields at once... > it emphasizes that the entire thing is becoming a mutable record, even > if you put final back onto some of the fields. From brian.goetz at oracle.com Thu Mar 29 21:39:21 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 29 Mar 2018 17:39:21 -0400 Subject: Records -- current status In-Reply-To: <5000593b-2eab-b812-a2a2-14d0e0927f33@oracle.com> References: <9be5a2ff-1839-c0d4-f502-68826fafa8c0@oracle.com> <84CF3300-16A2-4BC0-B0A7-AA17C497B335@oracle.com> <373E817C-6D19-46F4-BFF7-B8FC6E61D9FA@oracle.com> <5687a930-e028-b6ce-0413-1ed8b28ccdc4@oracle.com> <5000593b-2eab-b812-a2a2-14d0e0927f33@oracle.com> Message-ID: <69233fbf-05c7-1204-edaa-fa208b11b8e2@oracle.com> In any case, I think we've wedged ourselves pretty deep into a rathole.? We're now simultaneously discussing: ?- whether records should support mutability at all ?- if so, what semantic differences might mutable records have, if any ?- if so, do we want to nudge towards immutability? ?? - is it OK to have different defaults for records as for classes? - where to put mutability annotations ?- what keywords should be used Clearly that's a lot to cover at once.? 
Can we set the syntax issues aside until the first two are in hand? Let's roll back to my mail of 3/23 where I talked about why we're going to be screwed if we don't engage mutability constructively... On 3/29/2018 5:15 PM, Brian Goetz wrote: > The idea of factoring out the defaults somewhere close by isn't > intrinsically objectionable (though the suggestion to factor them all > the way into the module descriptor is horrible), but it makes me > unhappy for a different reason -- it creates the impression that there > are two kinds of records, mutable and regular, with separate > semantics.? That creates a proliferation of new concepts in the users > mind, and worse, in ours -- which makes it more likely the semantics > of mutable vs plain records will diverge eventually.? A modifier on > the field, on the other hand, is something the user already > understands, especially when it is something as self-explanatory as > `mutable` or `nonfinal` or `non-final`. > > I would like it to be clear that there is one kind of record. (Ideally > it deals well enough with both final and nonfinal fields, perhaps > favoring one over the other.) > > On the topic of how to spell "non-final", let's keep these in mind: > ?- There *will* be other negation keywords coming.? So a regularized > way to express it makes future decisions easier and reduces the > perceived cost of new keywords that are just the negation of old > keywords. > ?- It may be tempting to spell it "mutable", but that only describes > one meaning of final, and wouldn't do well for the others (final > classes and final methods.) > > (FWIW, non-final is considerably *easier* for the parser to handle > than "nonfinal" -- because "-" is already a token and "final" is > already a keyword.? Depending on where in the grammar a new contextual > keyword is allowed, one may have to jump through unpleasant hoops.? > But non-final poses relatively little problem, at least for our > compiler.? This is a strong point in favor of the non-keyword scheme.) > > > On 3/29/2018 4:59 PM, Kevin Bourrillion wrote: >> I somewhat like (gut-level) the idea of a single modifier on the >> record itself that reverses the default for all the fields at once... >> it emphasizes that the entire thing is becoming a mutable record, >> even if you put final back onto some of the fields. > From cushon at google.com Fri Mar 30 00:47:21 2018 From: cushon at google.com (Liam Miller-Cushon) Date: Fri, 30 Mar 2018 00:47:21 +0000 Subject: Feedback wanted: switch expression typing In-Reply-To: References: Message-ID: On Wed, Mar 28, 2018 at 3:55 PM Dan Smith wrote: > Subjectively, big picture: how concerned would you be about changing > typing rules in these cases? > My initial impression is that the compatibility impact of (C) would be manageable, especially with the change you mentioned to allow e.g. `Long j = flag ? jbox : 0`. The cases where existing code would be rejected look rare, and should be easy to understand and fix. I'm more concerned about situations where it would change the outcome of overload resolution, since existing code could be accepted but have subtly different behaviour, but those cases appear to be even rarer. As an aside, it might be valuable to have tools to help programmers prepare for this kind of change. For example I could imagine providing a refactoring that suggested fixes to some of the common incompatibilities, and it would be helpful if javac could warn about cases where target typing caused a different overload to be selected. 
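As a concrete instance of the overload-selection change mentioned above, here is the last of Dan's examples fleshed out; the outcomes in the comments are the ones Dan described, not new analysis:

    static void m(boolean b) { System.out.println("m(boolean)"); }
    static void m(Boolean b) { System.out.println("m(Boolean)"); }

    static void call(boolean test, boolean z, Boolean zbox) {
        // Under the current rules the conditional has the standalone type
        // boolean, so m(boolean) is selected.  Under target typing in an
        // argument position, both overloads are applicable and the call
        // becomes ambiguous, which is exactly the kind of outcome change a
        // warning could surface.
        m(test ? z : zbox);
    }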
> 1) The incompatibilities are maybe the biggest concern. And it's not clear > that it's helpful for the compiler to reject these sorts of conversions, so > maybe we should change the rules. > > In particular, this is silly: > Short s = 0; // fine > Long l = 0; // error > > So: what portion of "primitive.conditional.incompatible" are something > other than a literal? Other than a constant expression? > 92% of them are constants, <2% of those constants are non-literal constant expressions. > 2) Often, the choice of box class doesn't matter (e.g., if printing a > Byte/Short/Integer/Long as a string). What portion of > "primitive.conditional.box" seem to care about the which box class is > chosen? > I surveyed a sample of these. 60-80% immediately converted the expression to a string (e.g. log statements, String.format), and <10% of the samples cared about which box class was chosen. (The remainder weren't obviously in either category and I haven't investigated them yet.) In some of the cases that care which box is chosen, the change causes a different overload to be selected. > 3) A common pattern for null checking is: > > Integer ibox2 = (ibox == null) ? 0 : ibox; > > I'm guessing many of your "primitive.conditional.null" cases look like > that. And if not, they're likely to guarantee in the surrounding context > that no nulls are present. What portion of these actually seem to need and > expect a null pointer check? > I surveyed a sample of these. >95% of them are doing explicit null handling and don't require an implicit check, and most of those were trivial variations on `x != null ? x : defaultValue` I didn't find any cases where the implicit null check was expected. I may have missed some, but it appears to be rare. > 4) The overload resolution test casts a somewhat wide net, because > actually simulating overload resolution is complicated. So the test is > looking for cases in which there are other candidates that would be > considered. What portion of these invocations actually appear that they > would prompt a different overload choice or an ambiguity? And if the > resolved method changes, how often is it a behaviorally significant change > (often different overloads have the same behavior)? > The most common overloads reported by the diagnostic were String.valueOf, assertEquals, log, StringBuilder#append, and PrintStream#println. I audited some of the less common overloads, and all of them appeared to be 'well behaved' (selecting either one would have resulted in equivalent behaviour). -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 30 00:58:13 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 29 Mar 2018 20:58:13 -0400 Subject: Feedback wanted: switch expression typing In-Reply-To: References: Message-ID: <2C240A00-521C-47DB-8C6D-3980BBB938F7@oracle.com> There?s a middle ground, which is: to pursue B now, in conjunction with trying to evolve the rules for conditionals over time by proceeding from warning to error. That gets us to a consistent and simpler place in the long run while mitigating the effect of a sudden inconsistent change. > On Mar 28, 2018, at 3:37 PM, Dan Smith wrote: > > (Looking for some feedback on real-world code usage. Please read to the end, then if you can experiment with the code you work on and report back, I'd appreciate it!) 
> > Switch expressions, from a type checking perspective, are basically generalizations of conditional expressions: instead of 2 operands to check, we have n. > > A reasonable expectation is that, if I rewrite my conditional expression as a switch expression, it will behave the same: > > test ? foo() : bar() > is equivalent to > switch (test) { case true -> foo(); case false -> bar(); } > > So, as a starting point, the typing rules for switches should be the same as the typing rules for conditionals, but generalized to an arbitrary number of results. > > (The "results" of a switch expression are all expressions appearing after a '->' or a 'break'.) > > Conditional expressions and switch expressions are typically used as poly expressions (in a context that has a target type). But that won't always be the case. One notable usage that doesn't have a target type is an initializer for 'var': "var x = ...". So they are sometimes poly expressions, sometimes standalone. > > Conditional expression typing is driven by an ad hoc categorization scheme which looks at the result expressions and tries to predict whether they will all have type boolean/Boolean, primitive/boxed number, or something else/a mix ("tries to predict" because in some cases we can't type-check the expression until we've completed the categorization). > > In the numeric case, we then identify the narrowest primitive type that can contain the results. > > In the other/mixed case, we then type check by pushing down a target type, or, if none is available, producing a reference type from the lub operation. > > A couple of observations: > > - The primitive vs. reference choice is meaningful, because the primitive and reference type hierarchies are different (e.g., int can be widened to long, but Integer can't be widened to Long). Preferring primitive typing where possible seems like the right choice. > > - The ad hoc categorization is a bit of a mess. It's complex and imperfect. What people probably expect is that, where a target type is available, that's what the compiler will use?but the compiler ignores the target type in the primitive cases. > > Why? Well, in 8, when we introduced target typing of conditionals, we identified some incompatibilities that would occur if we changed the handing of primitives, and we didn't want to be disruptive. > > Some examples: > Boolean x = test ? z : zbox; // specified: can NPE; target typing: no null check > Integer x = test ? s : i; // specified: ok; target typing: can't convert short->Integer > Number x = test ? s : i; // specified: box to Integer; target typing: box to Short or Integer > double d = test ? l : f; // specified: long->float loses precision; target typing: long->double better precision > m(test ? z : zbox); // specified: prefers m(boolean); target typing: m(boolean) and m(Boolean) are ambiguous > > At this point, we've got a choice: > A) Fully mimic the conditional behavior in switch expressions > B) Do target typing (when available) for all switch expressions, diverging from conditionals > C) Do target typing (when available) for all switches and conditionals, accepting the incompatibilities > > (A) sacrifices simplicity. (B) sacrifices consistency. (C) sacrifices compatibility. > > General thoughts on simplicity (is the current behavior hard to understand?) and consistency (is it bad if the conditional/switch refactoring leads to subtly different typing?) are welcome. > > And we could use some clarification is just how significant the compatibility costs of (C) are. 
With that in mind, here's a javac patch: > > http://cr.openjdk.java.net/~dlsmith/logPrimitiveConditionals.patch > > A javac built with this patch supports an option that will output diagnostics wherever conditionals at risk of incompatible change are detected: > > javac -XDlogPrimitiveConditionals Foo.java > > If you're able to build OpenJDK with this patch and run it on some real-world code, I'd appreciate any insights about what you find. > > ?Dan From kevinb at google.com Fri Mar 30 15:09:12 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 30 Mar 2018 08:09:12 -0700 Subject: Expression switch exception naming In-Reply-To: <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> Message-ID: I think my overarching point is still this one: "Today, an experienced developer knows that there is a category of Errors that, when you see them in the absence of reflection, always implicate this kind of classpath issue. I can't see why this would not belong in that same category." The distinction, when a stack trace has just ruined my day, of whether I need to start thinking hard about what *real* mistake I might have made *in my code, *or whether I probably just have Class Path Insanity I should check out first, seems to be like a very high order distinction - more useful to illuminate than other various distinctions we can make. On Wed, Mar 28, 2018 at 12:48 PM, Brian Goetz wrote: > > > I have been figuring that if the client *has* a reasonable way to handle > unknown values then it will probably go ahead and do that (with a > `default`). > > > I think that's a fair assumption for your codebase, but not in general. > Developers will surely do this: > > x = switch (trafficLight) { > case RED -> ... > case YELLOW -> ... > case GREEN -> ... > } > > and leave out a default because they can. So they get a default default, > one that throws. No problem. > > The only question here is: what to throw. My argument is that Error is > just too strong an indicator. (It's like using fatal as your logging level > for everything; it would be more useful to use warning for things that > aren't fatal). > > From the Error doc: > > An Error is a subclass of Throwable that indicates serious problems that > a reasonable application should not try to catch. Most such errors are > abnormal conditions. > > Serious problems mean that underlying VM mechanism have failed. > Encountering an unexpected input is not in this category. Sure, it > deserves an exception, but its not an ICCE. > > Therefore I assumed that what we're talking about in this conversation is > the* other* kind, where there is nothing safe they can do - for example > if I wrote a method that displays a time interval as "10 ns" or "20 s", I > may not find it acceptable for me to start displaying "30 " > once I get handed TimeUnit.DAYS. My code is broken either way. If a > constant is added, I need to react to that, just like I do with a new > interface method. What does it really mean to say that this client "brings > a piece of the responsibility" if it doesn't really have a choice? > > > It's not unlike this: > > AnEnum e = f(...); > switch (e) { > ... > } > > and not being prepared for a null. You'll get an NPE. 
The local code > isn't expected to deal with it, but somewhere up the stack, someone is > prepared to deal with it, discard the offending incoming work item, log > what happened, and re-enter the work loop. > > So, I'm not quite yet following why the binary/source compatibility > distinction, or the opt-in distinction, really makes all the difference > here. > > > Some incompatibilities are more of a fire drill than others. Binary > incompatibilities (e.g., removing a method) are harder to recover from than > unexpected inputs. Further, while there may be no good _local_ recovery for > an unexpected input, there often is a reasonable global recovery. Error > means "fire drill". I claim this doesn't rise to the level of Error; it's > more like NumberFormatException or NPE or ClassCastException. > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 30 15:35:03 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 30 Mar 2018 11:35:03 -0400 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> Message-ID: <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> I think we are at the heart of the disagreement. I do not think that "someone added an enum constant" is necessarily a classpath insanity thing. It's just as likely to be a garden-variety "you made one assumption over here and another over there" within the same codebase. (And, we've told people for years that it's OK to add enum constants.) I think it's at least as likely that this is in the same category as any other assumption we make about code we call. Like "this method never returns null". Or "this method says it returns Object, but I know it always returns String, so I'll cast to String." So, it seems that the purpose of this exception is not to cast blame on the classpath, but to pinpoint the erroneous assumption. Maybe Y shouldn't have gotten new enum constants. Maybe X made bad assumptions about the future maintenance trajectory of Y. Either way, X and Y have to get together and get their story straight. So the exception should broker that meeting. And it seems at least as likely as not that X and Y are the same maintainer. If you got a stack trace that said: UnexpectedEnumValueException: surprising enum constant TrafficLight.BLUE; expected one of RED, YELLOW, GREEN at: MyClass.switchOnTrafficLightColors (line 666) or UnexpectedSealedTypeMemberException: surprising subtype Blue of sealed type TrafficLightColors, expected one of Red, Yellow, Green doesn't that meet your requirement? You look at the exception, and it says that a value was found that violated an assumption in your code. It could either be the fault of the enum maintainer or of the switch maintainer (who might well be the same person.) On 3/30/2018 11:09 AM, Kevin Bourrillion wrote: > I think my overarching point is still this one: > > "Today, an experienced developer knows that there is a category of > Errors that, when you see them in the absence of reflection, > always implicate this kind of classpath issue. I can't see why > this would not belong in that same category."
> > > The distinction, when a stack trace has just ruined my day, of whether > I need to start thinking hard about what /real/?mistake I might have > made /in my code, /or whether I probably just have Class Path Insanity > I should check out first, seems to be like a very high order > distinction - more useful to illuminate than other various > distinctions we can make. > > > > On Wed, Mar 28, 2018 at 12:48 PM, Brian Goetz > wrote: > > > >> I have been figuring that if the client /has/?a reasonable way to >> handle unknown values then it will probably go ahead and do that >> (with a `default`). > > I think that's a fair assumption for your codebase, but not in > general.? Developers will surely do this: > > ??? x = switch (trafficLight) { > ??????? case RED -> ... > ??????? case YELLOW -> ... > ??????? case GREEN -> ... > ??? } > > and leave out a default because they can.? So they get a default > default, one that throws.? No problem. > > The only question here is: what to throw.? My argument is that > Error is just too strong an indicator.? (It's like using fatal as > your logging level for everything; it would be more useful to use > warning for things that aren't fatal). > > From the Error doc: > > An|Error|is a subclass of|Throwable|that indicates serious > problems that a reasonable application should not try to catch. > Most such errors are abnormal conditions. > > Serious problems mean that underlying VM mechanism have failed.? > Encountering an unexpected input is not in this category.? Sure, > it deserves an exception, but its not an ICCE. > >> Therefore I assumed that what we're talking about in this >> conversation is the/other/ kind, where there is nothing safe they >> can do - for example if I wrote a method that displays a time >> interval as "10 ns" or "20 s", I may not find it acceptable for >> me to start displaying "30 " once I get handed >> TimeUnit.DAYS. My code is broken either way. If a constant is >> added, I need to react to that, just like I do with a new >> interface method. What does it really mean to say that this >> client "brings a piece of the responsibility" if it doesn't >> really have a choice? > > It's not unlike this: > > ??? AnEnum e = f(...); > ??? switch (e) { > ??????? ... > ??? } > > and not being prepared for a null.? You'll get an NPE. The local > code isn't expected to deal with it, but somewhere up the stack, > someone is prepared to deal with it, discard the offending > incoming work item, log what happened, and re-enter the work loop. > >> So, I'm not quite yet following why the binary/source >> compatibility distinction, or the opt-in distinction, really >> makes all the difference here. > > Some incompatibilities are more of a fire drill than others.? > Binary incompatibilities (e.g., removing a method) are harder to > recover from than unexpected inputs.? Further, while there may be > no good _local_ recover for an unexpected input, there often is a > reasonable global recovery.? Error means "fire drill".? I claim > this doesn't rise to the level of Error; it's more like > NumberFormatException or NPE or ClassCastException. > > > > > > -- > Kevin Bourrillion?|?Java Librarian |?Google, Inc.?|kevinb at google.com > -------------- next part -------------- An HTML attachment was scrubbed... 
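A small sketch of the construct being debated, using Brian's TrafficLight strawman from above (the exception name is deliberately left out, since that is the open question; the arrow form is the draft syntax used elsewhere in this thread):

    enum TrafficLight { RED, YELLOW, GREEN }

    // An "optimistically exhaustive" switch: no default clause, accepted
    // because every constant the compiler can see is handled.  The compiler
    // fills in a synthetic default that throws; whether that throwable is an
    // IncompatibleClassChange* subtype or a plainer runtime exception is what
    // this thread is deciding.
    static String describe(TrafficLight light) {
        return switch (light) {
            case RED    -> "stop";
            case YELLOW -> "slow";
            case GREEN  -> "go";
        };
    }

    // If TrafficLight later gains BLUE and this class is not recompiled,
    // describe(TrafficLight.BLUE) reaches that synthetic default at run time.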
URL: From cushon at google.com Fri Mar 30 16:05:59 2018 From: cushon at google.com (Liam Miller-Cushon) Date: Fri, 30 Mar 2018 16:05:59 +0000 Subject: Expression switch exception naming In-Reply-To: <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: On Fri, Mar 30, 2018 at 8:35 AM Brian Goetz wrote: > I think we are the heart of the disagreement. I do not think that > "someone added an enum constant" is necessarily a classpath insanity > thing. Its just as likely to be a garden-variety "you made one assumption > over here and another over there" within the same codebase. (And, we've > told people for years that its OK to add enum constants.) > Won't the garden-variety differences in assumptions normally be reported at compile-time? If the codebase is recompiled after the enum is updated, I would expect javac to reject an e-switch over an enum type that does not explicitly handle all enum constants and omits an explicit default case. For the exception being discussed to be thrown, the enum has to change after the code switching on it has been compiled, and then the switch has to be executed without having been recompiled. That skew between the classpath the switch was compiled against and the classpath the switch runs against is the 'classpath insanity' part. > I think its at least as likely that this is in the same category as any > other assumption we make about code we call. Like "this method never > returns null". Or "this method says it returns Object, but I know it > always returns String, so I'll cast to String." > I see a distinction here in that there's no way to enforce those other assumptions at compile-time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Fri Mar 30 16:07:09 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 30 Mar 2018 09:07:09 -0700 Subject: Expression switch exception naming In-Reply-To: <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: On Fri, Mar 30, 2018 at 8:35 AM, Brian Goetz wrote: I think we are the heart of the disagreement. I do not think that "someone > added an enum constant" is necessarily a classpath insanity thing. Its > just as likely to be a garden-variety "you made one assumption over here > and another over there" within the same codebase. > All I mean by "classpath insanity" is "all right, let me start by seeing why the version I compiled against and the version I ran against are different. Oh, how about that. If I compile against *that* jar, now I get a compile error which I know exactly how to fix." In fact, if my decision to go defaultless was intentional and I want to stick with it, then note that I *can't* even fix this problem unless I do the above first (i.e., ensure that I would get the *compile* error from recompiling my code). The same thing happens when I wrote a decorator for an abstract type and then a new method is added to that supertype. 
When I see AbstractMethodError, the first thing I have to do is fix my class path situation so that I get the *compile* error, before I can even fix the code. All the runtime error needs to do is prompt me to do that and then I can take it from there normally. The reasons we're trying to treat that case as different from this one seem too subtle to me. (And, we've told people for years that its OK to add enum constants.) > Yes, but we are deciding to reverse that decision 15 years later. We're doing that because we think it's worth it, but let's at least be clear about the fact that that is what we're doing. > I think its at least as likely that this is in the same category as any > other assumption we make about code we call. Like "this method never > returns null". Or "this method says it returns Object, but I know it > always returns String, so I'll cast to String." > Those situations (which suck, to be sure) I have no compile-time protection from at all, so argument I've been making doesn't apply. (Oops, I swear Liam and I are not coordinating our responses. :-)) So, it seems that the purpose of this exception is not to cast blame on the > classpath, but to pinpoint the erroneous assumption. > To repeat, "casting blame" on the classpath should help the user resolve the problem more quickly. > Maybe Y shouldn't have gotten new enum constants. Maybe X made bad > assumptions about the future maintenance trajectory of Y. Either way, X > and Y have to get together and get their story straight. So the exception > should broker that meeting. And its seems at least as likely as not that X > and Y are the same maintainer. > > If you got a stack trace that said: > > UnexpectedEnumValueException: surprising enum constant > TrafficLight.BLUE; expected one of RED, YELLOW, GREEN > at: MyClass.switchOnTrafficLightColors (line 666) > > or > > UnexpectedSealedTypeMemberException: surprising subtype Blue of > sealed type TrafficLightColors, expected one of Red, Yellow, Green > > doesn't that meet your requirement? You look at the exception, and it > says that a value was found that violated an assumption in your code. It > could either be the fault of the enum maintainer or of the switch > maintainer (who might well be the same person.) > > > On 3/30/2018 11:09 AM, Kevin Bourrillion wrote: > > I think my overarching point is still this one: > > "Today, an experienced developer knows that there is a category of Errors > that, when you see them in the absence of reflection, always implicate this > kind of classpath issue. I can't see why this would not belong in that same > category." > > > The distinction, when a stack trace has just ruined my day, of whether I > need to start thinking hard about what *real* mistake I might have made *in > my code, *or whether I probably just have Class Path Insanity I should > check out first, seems to be like a very high order distinction - more > useful to illuminate than other various distinctions we can make. > > > > On Wed, Mar 28, 2018 at 12:48 PM, Brian Goetz > wrote: > >> >> >> I have been figuring that if the client *has* a reasonable way to handle >> unknown values then it will probably go ahead and do that (with a >> `default`). >> >> >> I think that's a fair assumption for your codebase, but not in general. >> Developers will surely do this: >> >> x = switch (trafficLight) { >> case RED -> ... >> case YELLOW -> ... >> case GREEN -> ... >> } >> >> and leave out a default because they can. So they get a default default, >> one that throws. No problem. 
>> >> The only question here is: what to throw. My argument is that Error is >> just too strong an indicator. (It's like using fatal as your logging level >> for everything; it would be more useful to use warning for things that >> aren't fatal). >> >> From the Error doc: >> >> An Error is a subclass of Throwable that indicates serious problems that >> a reasonable application should not try to catch. Most such errors are >> abnormal conditions. >> >> Serious problems mean that underlying VM mechanism have failed. >> Encountering an unexpected input is not in this category. Sure, it >> deserves an exception, but its not an ICCE. >> >> Therefore I assumed that what we're talking about in this conversation is >> the* other* kind, where there is nothing safe they can do - for example >> if I wrote a method that displays a time interval as "10 ns" or "20 s", I >> may not find it acceptable for me to start displaying "30 " >> once I get handed TimeUnit.DAYS. My code is broken either way. If a >> constant is added, I need to react to that, just like I do with a new >> interface method. What does it really mean to say that this client "brings >> a piece of the responsibility" if it doesn't really have a choice? >> >> >> It's not unlike this: >> >> AnEnum e = f(...); >> switch (e) { >> ... >> } >> >> and not being prepared for a null. You'll get an NPE. The local code >> isn't expected to deal with it, but somewhere up the stack, someone is >> prepared to deal with it, discard the offending incoming work item, log >> what happened, and re-enter the work loop. >> >> So, I'm not quite yet following why the binary/source compatibility >> distinction, or the opt-in distinction, really makes all the difference >> here. >> >> >> Some incompatibilities are more of a fire drill than others. Binary >> incompatibilities (e.g., removing a method) are harder to recover from than >> unexpected inputs. Further, while there may be no good _local_ recover for >> an unexpected input, there often is a reasonable global recovery. Error >> means "fire drill". I claim this doesn't rise to the level of Error; it's >> more like NumberFormatException or NPE or ClassCastException. >> >> >> > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 30 16:32:16 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 30 Mar 2018 12:32:16 -0400 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: Summarizing what I heard: seeing Error is your trigger to "go fix your classpath and recompile".? And implicitly, you're saying: if it is an exception (no matter how clearly worded about inconsistent assumptions across classes), it is far less likely to trigger this reaction? > > (And, we've told people for years that its OK to add enum constants.) > > > Yes, but we are deciding to reverse that decision 15 years later. > We're doing that because we think it's worth it, but let's at least be > clear about the fact that that is what we're doing. > Let's back up.? Are we really reversing this??? 
Or are we doing something more subtle? Is it OK to add enum constants if they are not published across maintenance boundaries? Is it OK to add enum constants if you don't use expression switches? Is it OK to add enum constants if you use expression switches with explicit defaults? Suppose you publish an API that has enum TrafficLight { RED, YELLOW, GREEN } And I depend on your API with an optimistically exhaustive switch (OES). Then you add BLUE** So, who's at fault? - You, for adding a constant to an enum that is published across a maintenance boundary? - Me, for OESing on an enum imported across a maintenance boundary? - Java, for letting me OES across a maintenance boundary? (The latter is a new idea, but I think it is what you're getting at -- perhaps the rule should be that _within a module_, which is expected to be co-compiled, it's OK to leave off the default, but for "foreign" enums/sealed types, we're not going to put any faith in the claim of sealed-ness, and make you handle the default explicitly? **Note that just _adding_ an enum constant is not enough to trigger an error. Someone actually has to pass that enum to me. And I have to switch on it. And that switch has to be optimistically exhaustive. -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Fri Mar 30 16:54:21 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 30 Mar 2018 18:54:21 +0200 (CEST) Subject: Feedback wanted: switch expression typing In-Reply-To: References: Message-ID: <350254623.2244204.1522428861283.JavaMail.zimbra@u-pem.fr> I do not see (B) as sacrificing consistency because the premise is that an expression switch should be consistent with ?: But an expression switch can also be modeled as a classical switch that returns its value to a local variable. int a = switch(foo) { case 'a' -> 2; case 'b' -> 3; } can be seen as int a = $switch(foo); with int $switch(char foo) { switch(foo) { case 'a': return 2; case 'b': return 3; } } In that case, given that return allows target typing, (B) seems to be the only consistent choice, or (B) is as consistent as (A) with a different premise. regards, Rémi ----- Mail original ----- > De: "daniel smith" > À: "amber-spec-experts" > Envoyé: Mercredi 28 Mars 2018 21:37:51 > Objet: Feedback wanted: switch expression typing > (Looking for some feedback on real-world code usage. Please read to the end, > then if you can experiment with the code you work on and report back, I'd > appreciate it!) > > Switch expressions, from a type checking perspective, are basically > generalizations of conditional expressions: instead of 2 operands to check, we > have n. > > A reasonable expectation is that, if I rewrite my conditional expression as a > switch expression, it will behave the same: > > test ? foo() : bar() > is equivalent to > switch (test) { case true -> foo(); case false -> bar(); } > > So, as a starting point, the typing rules for switches should be the same as the > typing rules for conditionals, but generalized to an arbitrary number of > results. > > (The "results" of a switch expression are all expressions appearing after a '->' > or a 'break'.) > > Conditional expressions and switch expressions are typically used as poly > expressions (in a context that has a target type). But that won't always be the > case. One notable usage that doesn't have a target type is an initializer for > 'var': "var x = ...". So they are sometimes poly > expressions, sometimes standalone.
> > Conditional expression typing is driven by an ad hoc categorization scheme which > looks at the result expressions and tries to predict whether they will all have > type boolean/Boolean, primitive/boxed number, or something else/a mix ("tries > to predict" because in some cases we can't type-check the expression until > we've completed the categorization). > > In the numeric case, we then identify the narrowest primitive type that can > contain the results. > > In the other/mixed case, we then type check by pushing down a target type, or, > if none is available, producing a reference type from the lub operation. > > A couple of observations: > > - The primitive vs. reference choice is meaningful, because the primitive and > reference type hierarchies are different (e.g., int can be widened to long, but > Integer can't be widened to Long). Preferring primitive typing where possible > seems like the right choice. > > - The ad hoc categorization is a bit of a mess. It's complex and imperfect. What > people probably expect is that, where a target type is available, that's what > the compiler will use?but the compiler ignores the target type in the primitive > cases. > > Why? Well, in 8, when we introduced target typing of conditionals, we identified > some incompatibilities that would occur if we changed the handing of > primitives, and we didn't want to be disruptive. > > Some examples: > Boolean x = test ? z : zbox; // specified: can NPE; target typing: no null check > Integer x = test ? s : i; // specified: ok; target typing: can't convert > short->Integer > Number x = test ? s : i; // specified: box to Integer; target typing: box to > Short or Integer > double d = test ? l : f; // specified: long->float loses precision; target > typing: long->double better precision > m(test ? z : zbox); // specified: prefers m(boolean); target typing: m(boolean) > and m(Boolean) are ambiguous > > At this point, we've got a choice: > A) Fully mimic the conditional behavior in switch expressions > B) Do target typing (when available) for all switch expressions, diverging from > conditionals > C) Do target typing (when available) for all switches and conditionals, > accepting the incompatibilities > > (A) sacrifices simplicity. (B) sacrifices consistency. (C) sacrifices > compatibility. > > General thoughts on simplicity (is the current behavior hard to understand?) and > consistency (is it bad if the conditional/switch refactoring leads to subtly > different typing?) are welcome. > > And we could use some clarification is just how significant the compatibility > costs of (C) are. With that in mind, here's a javac patch: > > http://cr.openjdk.java.net/~dlsmith/logPrimitiveConditionals.patch > > A javac built with this patch supports an option that will output diagnostics > wherever conditionals at risk of incompatible change are detected: > > javac -XDlogPrimitiveConditionals Foo.java > > If you're able to build OpenJDK with this patch and run it on some real-world > code, I'd appreciate any insights about what you find. 
> > ?Dan From kevinb at google.com Fri Mar 30 17:37:26 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 30 Mar 2018 10:37:26 -0700 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: Although the depth of this debate may seem to have exceeded its value in resolving the question at hand, I think it's usually worthwhile to try hammer out these kinds of differences, until we all have a greater common understanding (or clearly arrive at an impasse...). On Fri, Mar 30, 2018 at 9:32 AM, Brian Goetz wrote: Summarizing what I heard: seeing Error is your trigger to "go fix your > classpath and recompile". And implicitly, you're saying: if it is an > exception (no matter how clearly worded about inconsistent assumptions > across classes), it is far less likely to trigger this reaction? > More or less, though let's not debate the precise meaning of "far". :-) I believe that developers would like to assume that IncompatibleClassChangeError means what it says it means, and that the JDK follows the Effective Java advice of always reusing appropriate exception types. (And, we've told people for years that its OK to add enum constants.) >> > > Yes, but we are deciding to reverse that decision 15 years later. We're > doing that because we think it's worth it, but let's at least be clear > about the fact that that is what we're doing. > > Let's back up. Are we really reversing this? Or are we doing something > more subtle? > > Is it OK to add enum constants if they are not published across > maintenance boundaries? > Is it OK to add enum constants if you don't use expression switches? > Is it OK to add enum constants if you use expression switches with > explicit defaults? > Like any other incompatible change, we can say "no, it's compatible as long as callers aren't doing X Y Z...", or "it's not so bad if it's not exported". The fact that there are these conditions at all is what makes it "incompatible". (Note that I do concede that many types of changes we consider to be source-compatible still *have* those conditions - e.g. if you add a method someone might have been wildcard-static-importing from both you and another class, that kind of thing -- but I think we generally deem them rare enough to not be worth worrying about; that's not what we're talking about here.) Suppose you publish an API that has > enum TrafficLight { RED, YELLOW, GREEN } > > And I depend on your API with an optimistically exhaustive switch (OES). > Then you add BLUE** > > So, who's at fault? > - You, for adding a switch to an enum that is published across a > maintenance boundary? > - Me, for OESing on an enum imported across a maintenance boundary? > - Java, for letting me OES across a maintenance boundary? > Just an observation: you've introduced the word "blame" and now "fault" to this discussion, but I think they aren't the real point. I think the relevant question is not "who's at fault" but "how do I proceed as quickly as possible to fixing it". Now, either this came to my attention through a compile-time or runtime error. If the former, it is clear what is going on, and is part of my normal workflow for how I get my code to work. 
If the latter, I'm suggesting that the best thing we can do is to prompt the developer to wonder "wait, why didn't I get a compile error?" so that it reduces to the former. (The latter is a new idea, but I think is what you're getting at -- perhaps > the rule should be that _within a module_, which is expected to be > co-compiled, its OK to leave off the default, but for "foreign" > enums/sealed types, we're not going to put any faith in the claim of > sealed-ness, and make you handle the default explicitly? > I'm not sure I understand this, and therefore I suspect it's not what I'm getting at. :-) > **Note that just _adding_ an enum is not enough to trigger an error. > Someone actually has to pass that enum to me. And I have to switch on it. > And that switch has to be optimistically exhaustive. > Same as with the decorator, right? Someone has to actually call the method. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 30 17:48:03 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 30 Mar 2018 13:48:03 -0400 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: > Although the depth of this debate may seem to have exceeded its value > in resolving the question at hand, I think it's usually worthwhile to > try hammer out these kinds of differences, until we all have a greater > common understanding (or clearly arrive at an impasse...). Nah, I'm not grumpy yet :) > Now, either this came to my attention through a compile-time or > runtime error. If the former, it is clear what is going on, and is > part of my normal workflow for how I get my code to work. If the > latter, I'm suggesting that the best thing we can do is to prompt the > developer to wonder "wait, why didn't I get a compile error?" so that > it reduces to the former. Backing way up, Alex had suggested that the right exception is (a subtype of) IncompatibleClassChangeEXCEPTION, rather than Error. I was concerned that ICC* would seem too low-level to users, though. But you're saying ICCE and subtypes are helpful to users, because they guide users to "blame your classpath". So in that case, is the ICC part a good enough trigger? > > > (The latter is a new idea, but I think is what you're getting at > -- perhaps the rule should be that _within a module_, which is > expected to be co-compiled, its OK to leave off the default, but > for "foreign" enums/sealed types, we're not going to put any faith > in the claim of sealed-ness, and make you handle the default > explicitly? > > > I'm not sure I understand this, and therefore I suspect it's not what > I'm getting at. :-) Expanding: For an enum in the same class/package/module as the switch, the chance of getting the error at runtime is either zero (same class) or effectively zero (same package or module), because all sane developers build packages and modules in an atomic operation. For an enum in a different module as the switch, the chance of getting the error at runtime is nonzero, because we're linking against a JAR at runtime.
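To make that concrete, here is a sketch of the cross-module case (names are made up, the arrow form is just the strawman syntax used earlier in this thread, and the exact thing the implicit default throws is the open question):

    // In the library JAR, as the client was compiled against it (v1):
    public enum TrafficLight { RED, YELLOW, GREEN }

    // In the client module, compiled once against v1 and not rebuilt:
    class Signals {
        static String describe(TrafficLight light) {
            // Optimistically exhaustive: no default, but every constant known
            // at compile time is covered, so this compiles without complaint.
            return switch (light) {
                case RED -> "stop";
                case YELLOW -> "slow";
                case GREEN -> "go";
            };
        }
    }

    // Later the library ships v2 with BLUE added. Run the stale client against
    // the v2 JAR, have someone actually pass BLUE to describe(), and the
    // compiler-synthesized default is reached at runtime -- the cross-module
    // hazard described above.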
So an alternative here is to tweak the language so that the "conclude exhaustiveness if all enum constants are present" behavior should be reserved for the cases where the switch and the enum are in the same module? (Just a thought.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Fri Mar 30 18:31:49 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 30 Mar 2018 11:31:49 -0700 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: On Fri, Mar 30, 2018 at 10:48 AM, Brian Goetz wrote: Backing way up, Alex had suggested that the right exception is (a subtype > of) IncompatibleClassChangeEXCEPTION, rather than Error. I was concerned > that ICC* would seem too low-level to users, though. But you're saying > ICCE and subtypes are helpful to suers, because they guide users to "blame > your classpath". SO in that case, is the ICC part a good enough trigger? > (Just to be clear, Remi and I have been advocating for a subtype of ICC *Error* all along, in case anyone missed that.) All right, I've been focusing too much on the hierarchy, but the leaf-level name is more important than that (and the message text further still, and since I assume we'll do a fine job of that, I can probably relax a little). To answer your question, sure, the "ICC" is a pretty decent signal. Have we discussed Cyrill's point on -observers that we should create more specific exception types, such as UnrecognizedEnumConstantE{rror,xception}? For an enum in the same class/package/module as the switch, the chance of > getting the error at runtime is either zero (same class) or effectively > zero (same package or module), because all sane developers build packages > and modules in an atomic operation. > > For an enum in a different module as the switch, the chance of getting the > error at runtime is nonzero, because we're linking against a JAR at > runtime. > > So an alternative here is to tweak the language so that the "conclude > exhaustiveness if all enum constants are present" behavior should be > reserved for the cases where the switch and the enum are in the same > module? > > (Just a thought.) > Okay, that is a sane approach, but I think it leaves too much of the value on the floor. I often benefit from having my exhaustiveness validated and being able to find out at compile time if things change in the future. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.goetz at oracle.com Fri Mar 30 18:39:30 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 30 Mar 2018 14:39:30 -0400 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> > All right, I've been focusing too much on the hierarchy, but the > leaf-level name is more important than that (and the message text > further still, and since I assume we'll do a fine job of that, I can > probably relax a little). To answer your question, sure, the "ICC" is > a pretty decent signal. Have we discussed Cyrill's point on -observers > that we should create more specific exception types, such as > UnrecognizedEnumConstantE{rror,xception}? Yes. What I'd been proposing was something like: class IncompatibleClassChangeException <: Exception or class UnexpectedClassChangeException <: Exception and then UnexpectedEnumConstantException <: {I,U}CCE UnexpectedSealedTypeException <: {I,U}CCE > Okay, that is a sane approach, but I think it leaves too much of the > value on the floor. I often benefit from having my exhaustiveness > validated and being able to find out at compile time if things change > in the future. To be clear, I was describing: - We'd always do exhaustiveness checking for expression switches - A default / total pattern always implies exhaustive - We'd additionally consider an expression switch to be exhaustive if all known enums are present _and_ the enum type is in the same module as the switch But that's probably too fussy. From forax at univ-mlv.fr Fri Mar 30 18:40:55 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 30 Mar 2018 20:40:55 +0200 (CEST) Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: <1181013078.2259809.1522435255441.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > À: "Kevin Bourrillion" > Cc: "amber-spec-experts" > Envoyé: Vendredi 30 Mars 2018 19:48:03 > Objet: Re: Expression switch exception naming >> Although the depth of this debate may seem to have exceeded its value in >> resolving the question at hand, I think it's usually worthwhile to try hammer >> out these kinds of differences, until we all have a greater common >> understanding (or clearly arrive at an impasse...). > Nah, I'm not grumpy yet :) >> Now, either this came to my attention through a compile-time or runtime error. >> If the former, it is clear what is going on, and is part of my normal workflow >> for how I get my code to work. If the latter, I'm suggesting that the best >> thing we can do is to prompt the developer to wonder "wait, why didn't I get a >> compile error?" so that it reduces to the former. > Backing way up, Alex had suggested that the right exception is (a subtype of) > IncompatibleClassChangeEXCEPTION, rather than Error. I was concerned that ICC* > would seem too low-level to users, though. But you're saying ICCE and subtypes > are helpful to users, because they guide users to "blame your classpath". So in > that case, is the ICC part a good enough trigger?
>>> (The latter is a new idea, but I think is what you're getting at -- perhaps the >>> rule should be that _within a module_, which is expected to be co-compiled, its >>> OK to leave off the default, but for "foreign" enums/sealed types, we're not >>> going to put any faith in the claim of sealed-ness, and make you handle the >>> default explicitly? >> I'm not sure I understand this, and therefore I suspect it's not what I'm >> getting at. :-) > Expanding: > For an enum in the same class/package/module as the switch, the chance of > getting the error at runtime is either zero (same class) or effectively zero > (same package or module), because all sane developers build packages and > modules in an atomic operation. > For an enum in a different module as the switch, the chance of getting the error > at runtime is nonzero, because we're linking against a JAR at runtime. > So an alternative here is to tweak the language so that the "conclude > exhaustiveness if all enum constants are present" behavior should be reserved > for the cases where the switch and the enum are in the same module? > (Just a thought.) Not having the same behavior due to a refactoring that introduces an intermediary module seems a big no-no for me. Rémi -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Fri Mar 30 18:42:38 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 30 Mar 2018 20:42:38 +0200 (CEST) Subject: Expression switch exception naming In-Reply-To: <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> Message-ID: <1582454420.2259931.1522435358794.JavaMail.zimbra@u-pem.fr> You still have not explained why you want to recover from one of these exceptions, knowing that it's simpler to add a default if you want to take care of an unknown enum constant. Rémi ----- Mail original ----- > De: "Brian Goetz" > À: "Kevin Bourrillion" > Cc: "amber-spec-experts" > Envoyé: Vendredi 30 Mars 2018 20:39:30 > Objet: Re: Expression switch exception naming >> All right, I've been focusing too much on the hierarchy, but the >> leaf-level name is more important than that (and the message text >> further still, and since I assume we'll do a fine job of that, I can >> probably relax a little). To answer your question, sure, the "ICC" is >> a pretty decent signal. Have we discussed Cyrill's point on -observers >> that we should create more specific exception types, such as >> UnrecognizedEnumConstantE{rror,xception}? > > Yes. What I'd been proposing was something like: > > class IncompatibleClassChangeException <: Exception > or > class UnexpectedClassChangeException <: Exception > > and then > > UnexpectedEnumConstantException <: {I,U}CCE > UnexpectedSealedTypeException <: {I,U}CCE > > >> Okay, that is a sane approach, but I think it leaves too much of the >> value on the floor. I often benefit from having my exhaustiveness >> validated and being able to find out at compile time if things change >> in the future. > > To be clear, I was describing: > - We'd always do exhaustiveness checking for expression switches > - A default / total pattern always implies exhaustive > - We'd additionally consider an expression switch to be exhaustive if > all known enums are present _and_ the enum type is in the same module as > the switch > > But that's probably too fussy.
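In code, the contrast is roughly this (a sketch only: the enum is made up, the arrow form is the strawman syntax, and IncompatibleClassChangeError appears purely as a placeholder for whatever type this thread settles on):

    enum TrafficLight { RED, YELLOW, GREEN }

    // Handle the "constant added later" case up front, in the switch itself:
    static String describe(TrafficLight light) {
        return switch (light) {
            case RED -> "stop";
            case YELLOW -> "slow";
            case GREEN -> "go";
            default -> "unknown";   // also covers any constant added later
        };
    }

    // ...versus writing the optimistic switch and trying to recover afterwards:
    static String describeOrRecover(TrafficLight light) {
        try {
            return switch (light) {
                case RED -> "stop";
                case YELLOW -> "slow";
                case GREEN -> "go";
            };
        } catch (IncompatibleClassChangeError e) {   // placeholder name
            return "unknown";
        }
    }

The first form costs one extra case; the second is the kind of recovery code it is hard to imagine anyone writing on purpose.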
From brian.goetz at oracle.com Fri Mar 30 18:44:23 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 30 Mar 2018 14:44:23 -0400 Subject: Expression switch exception naming In-Reply-To: <1582454420.2259931.1522435358794.JavaMail.zimbra@u-pem.fr> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> <1582454420.2259931.1522435358794.JavaMail.zimbra@u-pem.fr> Message-ID: I'm not talking about recovering.? This is purely taxonomy; this sort of mismatch does not (IMO) rise to the level of Error. On 3/30/2018 2:42 PM, Remi Forax wrote: > You still have not explain why you want to recover from one of these exception knowning that it's simpler to add a default if you want to take care of an unknown enum constant. > > R?mi > > ----- Mail original ----- >> De: "Brian Goetz" >> ?: "Kevin Bourrillion" >> Cc: "amber-spec-experts" >> Envoy?: Vendredi 30 Mars 2018 20:39:30 >> Objet: Re: Expression switch exception naming >>> All right, I've been focusing too much on the hierarchy, but the >>> leaf-level name is more important than that (and the message text >>> further still, and since I assume we'll do a fine job of that, I can >>> probably relax a little). To answer your question, sure, the "ICC" is >>> a pretty decent signal. Have we discussed Cyrill's point on -observers >>> that we should create more specific exception types, such as >>> UnrecognizedEnumConstantE{rror,xception}? >> Yes.? What I'd been proposing was something like: >> >> class IncompatibleClassChangeException <: Exception >> or >> classUnexpectedClassChangeException <: Exception >> >> and then >> >> UnexpectedEnumConstantException <: {I,U}CCE >> UnexpectedSealedTypeException <: {I,U}CCE >> >> >>> Okay, that is a sane approach, but I think it leaves too much of the >>> value on the floor. I often benefit from having my exhaustiveness >>> validated and being able to find out at compile time if things change >>> in the future. >> To be clear, I was describing: >> ?- We'd always do exhaustiveness checking for expression switches >> ?- A default / total pattern always implies exhaustive >> ?- We'd additionally consider an expression switch to be exhaustive if >> all known enums are present _and_ the enum type is in the same module as >> the switch >> >> But that's probably too fussy. From forax at univ-mlv.fr Fri Mar 30 18:50:58 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Fri, 30 Mar 2018 20:50:58 +0200 (CEST) Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> <1582454420.2259931.1522435358794.JavaMail.zimbra@u-pem.fr> Message-ID: <1630720866.2260517.1522435858717.JavaMail.zimbra@u-pem.fr> Do we have another case where we actually throw a runtime exception due to a separate compilation issue ? R?mi ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "Kevin Bourrillion" , "amber-spec-experts" > Envoy?: Vendredi 30 Mars 2018 20:44:23 > Objet: Re: Expression switch exception naming > I'm not talking about recovering.? This is purely taxonomy; this sort of > mismatch does not (IMO) rise to the level of Error. > > On 3/30/2018 2:42 PM, Remi Forax wrote: >> You still have not explain why you want to recover from one of these exception >> knowning that it's simpler to add a default if you want to take care of an >> unknown enum constant. 
>> >> R?mi >> >> ----- Mail original ----- >>> De: "Brian Goetz" >>> ?: "Kevin Bourrillion" >>> Cc: "amber-spec-experts" >>> Envoy?: Vendredi 30 Mars 2018 20:39:30 >>> Objet: Re: Expression switch exception naming >>>> All right, I've been focusing too much on the hierarchy, but the >>>> leaf-level name is more important than that (and the message text >>>> further still, and since I assume we'll do a fine job of that, I can >>>> probably relax a little). To answer your question, sure, the "ICC" is >>>> a pretty decent signal. Have we discussed Cyrill's point on -observers >>>> that we should create more specific exception types, such as >>>> UnrecognizedEnumConstantE{rror,xception}? >>> Yes.? What I'd been proposing was something like: >>> >>> class IncompatibleClassChangeException <: Exception >>> or >>> classUnexpectedClassChangeException <: Exception >>> >>> and then >>> >>> UnexpectedEnumConstantException <: {I,U}CCE >>> UnexpectedSealedTypeException <: {I,U}CCE >>> >>> >>>> Okay, that is a sane approach, but I think it leaves too much of the >>>> value on the floor. I often benefit from having my exhaustiveness >>>> validated and being able to find out at compile time if things change >>>> in the future. >>> To be clear, I was describing: >>> ?- We'd always do exhaustiveness checking for expression switches >>> ?- A default / total pattern always implies exhaustive >>> ?- We'd additionally consider an expression switch to be exhaustive if >>> all known enums are present _and_ the enum type is in the same module as >>> the switch >>> > >> But that's probably too fussy. From kevinb at google.com Fri Mar 30 18:55:09 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 30 Mar 2018 11:55:09 -0700 Subject: Expression switch exception naming In-Reply-To: <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> Message-ID: On Fri, Mar 30, 2018 at 11:39 AM, Brian Goetz wrote: Okay, that is a sane approach, but I think it leaves too much of the value >> on the floor. I often benefit from having my exhaustiveness validated and >> being able to find out at compile time if things change in the future. >> > > To be clear, I was describing: > - We'd always do exhaustiveness checking for expression switches > - A default / total pattern always implies exhaustive > - We'd additionally consider an expression switch to be exhaustive if all > known enums are present _and_ the enum type is in the same module as the > switch > Confirming that this is indeed how I understood it. I think it throws too much value out. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cushon at google.com Fri Mar 30 18:58:27 2018 From: cushon at google.com (Liam Miller-Cushon) Date: Fri, 30 Mar 2018 18:58:27 +0000 Subject: Expression switch exception naming In-Reply-To: <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> Message-ID: On Fri, Mar 30, 2018 at 11:40 AM Brian Goetz wrote: > - We'd additionally consider an expression switch to be exhaustive if > all known enums are present _and_ the enum type is in the same module as > the switch > > But that's probably too fussy. > I think it would be surprising for this to depend on where the declaration of the enum was located. Treating expression switches that handle all constants of an enum as exhaustive might be most valuable across compatibility boundaries. If the enum is declared nearby, adding a method to the enum is often a good alternative to switching on it. When switches are used to associate behaviour with enums in other libraries that can't be modified directly, having a way to ensure those switches are updated if values are added to the enum eliminates a category of bugs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at io7m.com Fri Mar 30 19:06:46 2018 From: mark at io7m.com (Mark Raynsford) Date: Fri, 30 Mar 2018 19:06:46 +0000 Subject: Expression switch exception naming In-Reply-To: <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> Message-ID: <20180330190646.313c877f@copperhead.int.arc7.info> On 2018-03-30T14:39:30 -0400 Brian Goetz wrote: > > To be clear, I was describing: > ?- We'd always do exhaustiveness checking for expression switches > ?- A default / total pattern always implies exhaustive > ?- We'd additionally consider an expression switch to be exhaustive if > all known enums are present _and_ the enum type is in the same module as > the switch > > But that's probably too fussy. That seems rather unpleasant: If my API returns values of a sealed type and I expect API consumers to match on/switch on values of that type (consider something like Scala's Either type), it'd be very nasty if they didn't get exhaustiveness checks just because the consumers live outside of my module. Perhaps I've misunderstood and that wasn't what was intended? Additionally... If we're tying things to modules, what will happen to OSGi? The module system there isn't integrated with the JPMS in any sense yet. I suppose you could sort of argue that the entire OSGi system lives in the unnamed module, but ... 
-- Mark Raynsford | http://www.io7m.com From brian.goetz at oracle.com Fri Mar 30 19:10:32 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 30 Mar 2018 15:10:32 -0400 Subject: Expression switch exception naming In-Reply-To: <20180330190646.313c877f@copperhead.int.arc7.info> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> <20180330190646.313c877f@copperhead.int.arc7.info> Message-ID: <84d0bd76-2089-30fa-4d07-6801e4b82218@oracle.com> Yes, you misunderstood :) You would always get an exhaustiveness check. What you'd not get is the "grace" of having said: case RED: case YELLOW: case GREEN: without a default, and having that still be considered exhaustive because these are all the alternatives known at compile time. It would be like today, where flow analysis sometimes requires you to have a default on an enum switch even though you've covered all the bases. On 3/30/2018 3:06 PM, Mark Raynsford wrote: > That seems rather unpleasant: If my API returns values of a sealed type > and I expect API consumers to match on/switch on values of that type > (consider something like Scala's Either type), it'd be very nasty if > they didn't get exhaustiveness checks just because the consumers live > outside of my module. > > Perhaps I've misunderstood and that wasn't what was intended?
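For reference, the "like today" situation in code (a sketch with a made-up enum; this version is deliberately the one the compiler rejects):

    enum TrafficLight { RED, YELLOW, GREEN }

    static String describe(TrafficLight light) {
        String result;                    // blank local
        switch (light) {
            case RED:    result = "stop"; break;
            case YELLOW: result = "slow"; break;
            case GREEN:  result = "go";   break;
            // default: result = "?";     // flow analysis insists on this arm
        }
        return result;  // error: variable result might not have been initialized
    }

All three constants are covered, yet the statement switch is not treated as exhaustive, so definite assignment fails until a default is added -- the same "grace" an exhaustive switch expression would give you.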
Now things seem to have shifted somewhat. I configure my IDE to refuse to allow me to use default on enum switches, and to treat missing cases as a compilation error: switch (e) { case RED: ... case YELLOW: ... case GREEN: ... } throw new UnreachableCodeException(); I don't specify a default, so the code raises UnreachableCodeException if someone makes a binary-incompatible change and I've not recompiled, and the IDE/compiler tells me if someone added an enum constant that I've not handled by refusing to compile my code. -- Mark Raynsford | http://www.io7m.com From brian.goetz at oracle.com Fri Mar 30 20:17:35 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 30 Mar 2018 16:17:35 -0400 Subject: Expression switch exception naming In-Reply-To: <20180330200711.22e83be1@copperhead.int.arc7.info> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> <20180330190646.313c877f@copperhead.int.arc7.info> <84d0bd76-2089-30fa-4d07-6801e4b82218@oracle.com> <20180330200711.22e83be1@copperhead.int.arc7.info> Message-ID: > It's not clear to me what the utility of nominally always having an > exhaustiveness check would be if I end up having to include a "default" > everywhere anyway. If someone adds an enum constant (and I'm compiling > my code against their new code), I want to get compilation errors in > every switch that now needs to be updated. If I have to add a "default" > everywhere in the source, I won't get that (and will have to hope that > my test coverage is good enough to find all the switches that are now > incorrect). OK, we have a terminology confusion over the term "exhaustiveness checking."? I meant that the compiler won't let you write an inexhaustive switch expression (which, for almost all target types, will require a default/total pattern), even though it will let you for switch statements (in the absence of flow constraints to the contrary, such as a blank local.) > My understanding to date has been that a "default" wasn't going to be > required for enum and sealed types, and that if I didn't provide one, > the compiler would synthesize one that raises an exception... Now > things seem to have shifted somewhat. They haven't shifted; I was describing this option as a means of getting at Kevin's distinction about classpath dependencies.? Within a maintenance domain, you basically never have to worry about your enums and switches over those enums getting out of sync; across maintenance domains, you do.? So the question was, should we consider treating cross-module "hope nothing changes between now and runtime" assumptions more skeptically.? It was a thought experiment, not a proposal, aimed at closing the loop (since this thread has had an awful lot of talking-past-each-other.) > I configure my IDE to refuse to allow me to use default on enum > switches, and to treat missing cases as a compilation error: > > switch (e) { > case RED: ... > case YELLOW: ... > case GREEN: ... > } > throw new UnreachableCodeException(); We'd not do anything for switch statements.? Exhaustiveness checking is only for switch expressions in any case. But, your point is taken; *not* having a default in a situation that requires exhaustiveness acts as a type-check on that exhaustiveness, and saying default will then cover up any sins.? I get it. 
From mark at io7m.com Fri Mar 30 20:33:42 2018 From: mark at io7m.com (Mark Raynsford) Date: Fri, 30 Mar 2018 20:33:42 +0000 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <2be3dc25-6795-ae43-57ff-0226a034ed21@oracle.com> <20180330190646.313c877f@copperhead.int.arc7.info> <84d0bd76-2089-30fa-4d07-6801e4b82218@oracle.com> <20180330200711.22e83be1@copperhead.int.arc7.info> Message-ID: <20180330203342.60ac2564@copperhead.int.arc7.info> On 2018-03-30T16:17:35 -0400 Brian Goetz wrote: > > OK, we have a terminology confusion over the term "exhaustiveness > checking." Got it, I'm up to speed! > But, your point is taken; *not* having a default in a situation that > requires exhaustiveness acts as a type-check on that exhaustiveness, and > saying default will then cover up any sins.? I get it. Yep, that's the bit I'd hate to lose (not that I actually have it right now :]). -- Mark Raynsford | http://www.io7m.com From daniel.smith at oracle.com Sat Mar 31 01:44:49 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 30 Mar 2018 19:44:49 -0600 Subject: Feedback wanted: switch expression typing In-Reply-To: <350254623.2244204.1522428861283.JavaMail.zimbra@u-pem.fr> References: <350254623.2244204.1522428861283.JavaMail.zimbra@u-pem.fr> Message-ID: > On Mar 30, 2018, at 10:54 AM, Remi Forax wrote: > > I do not see (B) as sacrifying the consistency because the premise is that an expression switch should be consistent with ?: > > But an expression switch can also be modeled as a classical switch that returns it's value to a local variable. > > int a = switch(foo) { > case 'a' -> 2; > case 'b' -> 3; > } > can be see as > int a = $switch(foo); > with > int $switch(char foo) { > case 'a': return 2; > case 'b': return 3; > } I mean, sure, this is another way to assert "switches in assignment contexts should always be poly expressions". But it's just as easy to assert "conditional expressions in assignment contexts should always be poly expressions". int a = test ? 2 : 3; can be seen as int a = $conditional(test); with int $conditional(boolean test) { if (test) return 2; else return 3; } Those are probably good principles. But if we embrace them, we're doing (C). ?Dan From forax at univ-mlv.fr Sat Mar 31 10:23:50 2018 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 31 Mar 2018 12:23:50 +0200 (CEST) Subject: Feedback wanted: switch expression typing In-Reply-To: References: <350254623.2244204.1522428861283.JavaMail.zimbra@u-pem.fr> Message-ID: <17865431.2309562.1522491830602.JavaMail.zimbra@u-pem.fr> The fact that the semantics of ?: is very ad-hoc is a kind of accident of the history, we may want to fix it but i do not see why we have to fix it at the same time that we introduce the expression switch, we can fix the semantics of ?: later or never. R?mi ----- Mail original ----- > De: "daniel smith" > ?: "Remi Forax" > Cc: "amber-spec-experts" > Envoy?: Samedi 31 Mars 2018 03:44:49 > Objet: Re: Feedback wanted: switch expression typing >> On Mar 30, 2018, at 10:54 AM, Remi Forax wrote: >> >> I do not see (B) as sacrifying the consistency because the premise is that an >> expression switch should be consistent with ?: >> >> But an expression switch can also be modeled as a classical switch that returns >> it's value to a local variable. 
>> >> int a = switch(foo) { >> case 'a' -> 2; >> case 'b' -> 3; >> } >> can be see as >> int a = $switch(foo); >> with >> int $switch(char foo) { >> case 'a': return 2; >> case 'b': return 3; >> } > > I mean, sure, this is another way to assert "switches in assignment contexts > should always be poly expressions". > > But it's just as easy to assert "conditional expressions in assignment contexts > should always be poly expressions". > > int a = test ? 2 : 3; > can be seen as > int a = $conditional(test); > with > int $conditional(boolean test) { > if (test) return 2; > else return 3; > } > > Those are probably good principles. But if we embrace them, we're doing (C). > > ?Dan From dl at cs.oswego.edu Sat Mar 31 11:13:46 2018 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 31 Mar 2018 07:13:46 -0400 Subject: Feedback wanted: switch expression typing In-Reply-To: <17865431.2309562.1522491830602.JavaMail.zimbra@u-pem.fr> References: <350254623.2244204.1522428861283.JavaMail.zimbra@u-pem.fr> <17865431.2309562.1522491830602.JavaMail.zimbra@u-pem.fr> Message-ID: On Sat, March 31, 2018 6:23 am, forax at univ-mlv.fr wrote: > The fact that the semantics of ?: is very ad-hoc is a kind of accident of > the history, > we may want to fix it but i do not see why we have to fix it at the same > time that we introduce the expression switch, > we can fix the semantics of ?: later or never. Where "later" probably means "never". It should be fixed now. I agree that (B) and (C) are basically the same, so choose (C). I've had to fiddle with :? to get the compiler to shut up about reasonable-looking expressions. (Sorry, I can't recall examples.) Having the same story for both of them would be best, assuming that existing code doesn't break. -Doug > > R??mi > > ----- Mail original ----- >> De: "daniel smith" >> ??: "Remi Forax" >> Cc: "amber-spec-experts" >> Envoy??: Samedi 31 Mars 2018 03:44:49 >> Objet: Re: Feedback wanted: switch expression typing > >>> On Mar 30, 2018, at 10:54 AM, Remi Forax wrote: >>> >>> I do not see (B) as sacrifying the consistency because the premise is >>> that an >>> expression switch should be consistent with ?: >>> >>> But an expression switch can also be modeled as a classical switch that >>> returns >>> it's value to a local variable. >>> >>> int a = switch(foo) { >>> case 'a' -> 2; >>> case 'b' -> 3; >>> } >>> can be see as >>> int a = $switch(foo); >>> with >>> int $switch(char foo) { >>> case 'a': return 2; >>> case 'b': return 3; >>> } >> >> I mean, sure, this is another way to assert "switches in assignment >> contexts >> should always be poly expressions". >> >> But it's just as easy to assert "conditional expressions in assignment >> contexts >> should always be poly expressions". >> >> int a = test ? 2 : 3; >> can be seen as >> int a = $conditional(test); >> with >> int $conditional(boolean test) { >> if (test) return 2; >> else return 3; >> } >> >> Those are probably good principles. But if we embrace them, we're doing >> (C). 
>> >> ???Dan > From dl at cs.oswego.edu Sat Mar 31 11:56:44 2018 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 31 Mar 2018 07:56:44 -0400 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: On Fri, March 30, 2018 1:48 pm, Brian Goetz wrote: > > So an alternative here is to tweak the language so that the "conclude > exhaustiveness if all enum constants are present" behavior should be > reserved for the cases where the switch and the enum are in the same > module? > I might have missed discussion of this, but has anyone considered the alternative of finally allowing "final" on an enum class? In this case, several sets of simpler alternatives would be possible. -Doug From forax at univ-mlv.fr Sat Mar 31 12:14:26 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 31 Mar 2018 14:14:26 +0200 (CEST) Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: <1705576534.2315182.1522498466503.JavaMail.zimbra@u-pem.fr> An enum class is always sealed, there is a fixed number of constants values, and there is also a fixed number of subtypes, otherwise values() is not correctly implemented. R?mi ----- Mail original ----- > De: "Doug Lea"
> ?: "Brian Goetz" > Cc: "amber-spec-experts" > Envoy?: Samedi 31 Mars 2018 13:56:44 > Objet: Re: Expression switch exception naming > On Fri, March 30, 2018 1:48 pm, Brian Goetz wrote: > >> >> So an alternative here is to tweak the language so that the "conclude >> exhaustiveness if all enum constants are present" behavior should be >> reserved for the cases where the switch and the enum are in the same >> module? >> > > I might have missed discussion of this, but has anyone considered > the alternative of finally allowing "final" on an enum class? In > this case, several sets of simpler alternatives would be possible. > > -Doug