From dl at cs.oswego.edu Sun Apr 1 11:59:07 2018
From: dl at cs.oswego.edu (Doug Lea)
Date: Sun, 1 Apr 2018 07:59:07 -0400
Subject: Expression switch exception naming
In-Reply-To: <1705576534.2315182.1522498466503.JavaMail.zimbra@u-pem.fr>
References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com>
 <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com>
 <1705576534.2315182.1522498466503.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Sat, March 31, 2018 8:14 am, Remi Forax wrote:
> An enum class is always sealed, there is a fixed number of constants
> values, and there is also a fixed number of subtypes,
> otherwise values() is not correctly implemented.

Right. My point was that allowing "final" might allow programmers
to distinguish the cases under question, rather than forcing a new
arbitrary rule about switch and enum being in same module, or whatever.

-Doug

Is non-exhaustiveness an Error or not?
>
> Rémi
>
> ----- Original Message -----
>> From: "Doug Lea"
>> To: "Brian Goetz"
>> Cc: "amber-spec-experts"
>> Sent: Saturday, March 31, 2018 13:56:44
>> Subject: Re: Expression switch exception naming
>
>> On Fri, March 30, 2018 1:48 pm, Brian Goetz wrote:
>>
>>>
>>> So an alternative here is to tweak the language so that the "conclude
>>> exhaustiveness if all enum constants are present" behavior should be
>>> reserved for the cases where the switch and the enum are in the same
>>> module?
>>>
>>
>> I might have missed discussion of this, but has anyone considered
>> the alternative of finally allowing "final" on an enum class? In
>> this case, several sets of simpler alternatives would be possible.
>>
>> -Doug
>

From brian.goetz at oracle.com Tue Apr 3 16:36:43 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 3 Apr 2018 12:36:43 -0400
Subject: Compile-time type hierarchy information in pattern switch
Message-ID: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>

Along the lines of the previous discussion about separate compilation
skew with enums ... I'm trying to find the right place to draw the line
with respect to post-compilation class hierarchy changes.

Recall that we can impose a _dominance ordering_ on patterns; pattern P
dominates Q if everything that is matched by Q also is matched by P.
We already use this today, in catch blocks, to reject programs with dead
code; you can't say `catch Exception` before `catch IOException`,
because the latter block would be dead. We want to do the same with
patterns, so:

    case String x: ...
    case Object x: ...

is OK but

    case Object x: ...
    case String x: ...

is rejected at compile time.

Separately, we'd like for pattern matching to be efficient; the
definition of "inefficient" would be for pattern matching to be
inherently O(n), when we can frequently do much better. There's plenty
of literature on compiling patterns to decision trees, but none of it
addresses the problem we have to: separate compilation.
So any decision tree computed at compile time might be wrong in
undesirable ways by runtime. We could also compute a decision tree at
runtime using indy; while this is our intent, the devil is in the
details. We don't want computing the tree to be too expensive, nor do
we want to have to capture O(n^2) compile-time constraints to be
validated at runtime. So I'd like to focus on what changes we're
willing to accept between compilation and runtime, and what our
expectations would be for those changes.

We've already discussed one of these: novel values in enum / sealed type
switches, and for them, the answer is throwing some sort of exception.
Another that we dealt with long ago is changing enum ordinals; we
decided at the time that we're willing for this to be a BC change, so we
generate extra code that uses the as-runtime ordinals rather than the
as-compile-time ordinals when lowering the switch into an integer
switch. (If we weren't willing to tolerate such changes, we'd have a
simpler translation: just lower an enum switch to a switch on its
ordinal.)

Here's one that I suspect we're not expecting to recover terribly well
from: hierarchy inversion. Suppose at compile time A <: B. So the
following is a sensible switch body:

    case String: println("String"); break;
    case Object: println("Object"); break;

Now, imagine that by runtime, String no longer extends Object, but
instead Object absurdly extends String. Do we still expect the above to
print String for all Strings, and Object for everything else? Or is the
latter arm now dead at runtime, even though it wouldn't compile after
the change? Or is this now UB, because it would no longer compile?

A more realistic example of a hierarchy change is introducing an
interface. If we have:

    interface I { }
    class C { }

and a switch

    case I: ...
    case C: ...

and later, we make C implement I, we have a similar situation; the
switch would no longer compile.
Are we allowed to make optimizations based on the compile-time
knowledge that C nonfinal, etc.)

From mark at io7m.com Wed Apr 4 17:01:05 2018
From: mark at io7m.com (Mark Raynsford)
Date: Wed, 4 Apr 2018 17:01:05 +0000
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
Message-ID: <20180404170105.5486a1c7@copperhead.int.arc7.info>

On 2018-04-03T12:36:43 -0400
Brian Goetz wrote:
>
> Here's one that I suspect we're not expecting to recover terribly well
> from: hierarchy inversion. Suppose at compile time A <: B. So the
> following is a sensible switch body:
>
>     case String: println("String"); break;
>     case Object: println("Object"); break;
>
> Now, imagine that by runtime, String no longer extends Object, but
> instead Object absurdly extends String. Do we still expect the above to
> print String for all Strings, and Object for everything else? Or is the
> latter arm now dead at runtime, even though it wouldn't compile after
> the change? Or is this now UB, because it would no longer compile?

I'm still giving thought to everything you've written, but I am
wondering: How feasible is it to get the above to fail early with an
informative exception/Error?
--
Mark Raynsford | http://www.io7m.com

From brian.goetz at oracle.com Wed Apr 4 17:07:17 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 4 Apr 2018 13:07:17 -0400
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <20180404170105.5486a1c7@copperhead.int.arc7.info>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
Message-ID: <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>

The intended implementation strategy is to lower complex switches to
densely-numbered `int` switches, and then invoke a classifier function
that takes a target and returns the int corresponding to the lowered
case number. The classifier function will be an `invokedynamic`, whose
static bootstrap will contain a summary of the patterns. (We've already
done this for switches on strings, enums, longs, non-dense ints, etc.)

To deliver an early error, that means that (a) the compiler must encode
through the static argument list all the assumptions it needs verified
at runtime (e.g., `String <: Object`), and (b) at linkage time (the
first time the switch is executed), those have to be tested.

Doing so is plenty easy, but there's a startup cost, which could be as
bad as _O(n^2)_, if I have to validate that no two case labels are
ordered inconsistently with subtyping.

A possible mitigation is to do the check as a system assertion, which
only gets run if we are run with `-esa`; we then might still have some
static code bloat (depending on how we encode the assumptions), but at
least skip the dynamic check most of the time.

On 4/4/2018 1:01 PM, Mark Raynsford wrote:
> I'm still giving thought to everything you've written, but I am
> wondering: How feasible is it to get the above to fail early with an
> informative exception/Error?
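
To make the lowering described above concrete, here is a minimal sketch.
It is only an illustration under stated assumptions: `classify` and
`describe` are hypothetical names, the classifier is a plain static
method doing a linear scan rather than an `invokedynamic` call site, and
the three case types are chosen arbitrarily.

```java
import java.util.List;

final class ClassifierSketch {
    // Hypothetical stand-in for the indy-linked classifier: returns the
    // index of the first case type that matches the target, or -1.
    static int classify(Object target, List<Class<?>> caseTypes) {
        for (int i = 0; i < caseTypes.size(); i++) {
            if (caseTypes.get(i).isInstance(target)) {
                return i;
            }
        }
        return -1; // no case matched (isInstance(null) is always false)
    }

    static String describe(Object target) {
        // Source-level intent:
        //   case String: ...  case Number: ...  case Object: ...
        List<Class<?>> cases =
            List.of(String.class, Number.class, Object.class);
        // The pattern switch has been lowered to a dense int switch.
        switch (classify(target, cases)) {
            case 0:  return "String";
            case 1:  return "Number";
            case 2:  return "Object";
            default: return "no match";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("hello"));    // String
        System.out.println(describe(42));         // Number
        System.out.println(describe(new int[0])); // Object
        System.out.println(describe(null));       // no match
    }
}
```

The linear scan always honors the textual case order; the open question
in this thread is how far a smarter classifier may depart from that scan
while still relying on the compile-time ordering.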
From peter.levart at gmail.com Thu Apr 5 14:40:28 2018
From: peter.levart at gmail.com (Peter Levart)
Date: Thu, 5 Apr 2018 16:40:28 +0200
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
Message-ID: <67674e17-1e86-252f-72f2-9dd6ab78c03e@gmail.com>

Hi,

On 04/04/2018 07:07 PM, Brian Goetz wrote:
> The intended implementation strategy is to lower complex switches to
> densely-numbered `int` switches, and then invoke a classifier function
> that takes a target and returns the int corresponding to the lowered
> case number. The classifier function will be an `invokedynamic`,
> whose static bootstrap will contain a summary of the patterns. (We've
> already done this for switches on strings, enums, longs, non-dense
> ints, etc.)
>
> To deliver an early error, that means that (a) the compiler must
> encode through the static argument list all the assumptions it needs
> verified at runtime (e.g., `String <: Object`), and (b) at linkage
> time (the first time the switch is executed), those have to be tested.
>
> Doing so is plenty easy, but there's a startup cost, which could be as
> bad as _O(n^2)_, if I have to validate that no two case labels are
> ordered inconsistently with subtyping.

Not necessarily. O(n log n) at worst for stable-sorting n cases which,
if already sorted at compile time (i.e. no subtype changes between
compile and link time), are resorted using just n-1 comparisons.
That's if you want to "fix" the order of cases at link-time in order
to compute optimal dispatch logic. If you only want to verify and
bail-out if they are not sorted already (i.e. you only accept changes
in type hierarchy that don't change order of cases), you always need
just n-1 comparisons.

The question is whether you only want to re-order / check-order
according to type hierarchy or also according to other aspects of
"dominance", for example:

    case Point p where (p.x >= 0 && p.y >= 0): ...
    case Point p where (p.x >= 0): ...

Other aspects of dominance usually don't change between compile and link
time, so stable-sorting cases could take just type hierarchy into
account, unless you also allow type-hierarchy based conditions in where
patterns, for example:

    case Holder h where (h.value instanceof TypeA): ...
    case Holder h where (h.value instanceof TypeB): ...

Another problem with re-ordering cases at link time is when you support
fall-through. What are fall-through(s) in a switch with re-ordered
cases? For example:

interface A {}
interface B extends A {}

switch (x) {
    case B b:
        ...
        // fall-through...
    case A a:
        A ab = ... ? a : b;
        ...

What happens when you remove A from supertypes of B in a separately
compiled code:

interface A {}
interface B {}

Perhaps there's no need to worry about this as the verifier would
already catch such invalid code during runtime. So fall-through(s)
could just stay the same even if cases are virtually reordered for the
purpose of computing dispatch logic. The fall-through logic could
sometimes survive changes in type hierarchy unnoticed by the verifier
but would give questionable results when executed. But that could be
said for any logic, not necessarily concerned with switch statements.

Here's an experiment I played with that clearly separates compile-time,
link-time and run-time parts of logic and is just API.
You can even simulate the effects of adding subtype relationship(s)
between compile-time of switch and link-time:

http://cr.openjdk.java.net/~plevart/misc/TypeSwitch/TypeSwitch.java

Regards, Peter

>
> A possible mitigation is to do the check as a system assertion, which
> only gets run if we are run with `-esa`; we then might still have some
> static code bloat (depending on how we encode the assumptions), but at
> least skip the dynamic check most of the time.
>
> On 4/4/2018 1:01 PM, Mark Raynsford wrote:
>> I'm still giving thought to everything you've written, but I am
>> wondering: How feasible is it to get the above to fail early with an
>> informative exception/Error?
>

From forax at univ-mlv.fr Thu Apr 5 15:21:52 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 5 Apr 2018 17:21:52 +0200 (CEST)
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
Message-ID: <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "mark"
> Cc: "amber-spec-experts"
> Sent: Wednesday, April 4, 2018 19:07:17
> Subject: Re: Compile-time type hierarchy information in pattern switch

> The intended implementation strategy is to lower complex switches to
> densely-numbered `int` switches, and then invoke a classifier function
> that takes a target and returns the int corresponding to the lowered
> case number. The classifier function will be an `invokedynamic`, whose
> static bootstrap will contain a summary of the patterns. (We've already
> done this for switches on strings, enums, longs, non-dense ints, etc.)
>
> To deliver an early error, that means that (a) the compiler must encode
> through the static argument list all the assumptions it needs verified
> at runtime (e.g., `String <: Object`), and (b) at linkage time (the
> first time the switch is executed), those have to be tested.
>
> Doing so is plenty easy, but there's a startup cost, which could be as
> bad as _O(n^2)_, if I have to validate that no two case labels are
> ordered inconsistently with subtyping.
>
> A possible mitigation is to do the check as a system assertion, which
> only gets run if we are run with `-esa`; we then might still have some
> static code bloat (depending on how we encode the assumptions), but at
> least skip the dynamic check most of the time.

Or we can not try to do any check at runtime that validates the view of
the world at compile time.
Currently, there is no check that verifies that the catch blocks are in
the right order or that a cascade of if-instanceofs means the same thing
at compile time and at runtime.

My opinion, we should just run the code that was compiled, even if the
world has changed between the compilation and the execution.

Rémi

>
> On 4/4/2018 1:01 PM, Mark Raynsford wrote:
>> I'm still giving thought to everything you've written, but I am
>> wondering: How feasible is it to get the above to fail early with an
>> informative exception/Error?

From brian.goetz at oracle.com Thu Apr 5 15:25:36 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 5 Apr 2018 11:25:36 -0400
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
 <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr>
Message-ID: <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com>

Yes, this is surely an option.

But it doesn't answer the underlying question -- if the hierarchy
changes in various ways between compile and runtime, what behavior can
the user count on, and what changes yield "undefined" behavior?

While it's easy to say "you should do what the code says", taking that
too far ties our hands behind our back, and makes switches that
should be O(1) into O(n).

On 4/5/2018 11:21 AM, Remi Forax wrote:
> Or we can not try to do any check at runtime that validates the view of
> the world at compile time.
> Currently, there is no check that verifies that the catch blocks are in
> the right order or that a cascade of if-instanceofs means the same
> thing at compile time and at runtime.
>
> My opinion, we should just run the code that was compiled, even if the
> world has changed between the compilation and the execution.

From forax at univ-mlv.fr Thu Apr 5 15:40:59 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 5 Apr 2018 17:40:59 +0200 (CEST)
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
 <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr>
 <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com>
Message-ID: <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "mark" , "amber-spec-experts"
> Sent: Thursday, April 5, 2018 17:25:36
> Subject: Re: Compile-time type hierarchy information in pattern switch

> Yes, this is surely an option.
>
> But it doesn't answer the underlying question -- if the hierarchy
> changes in various ways between compile and runtime, what behavior can
> the user count on, and what changes yield "undefined" behavior?

no, it's not undefined, at least not an "undefined behavior" as in C.
At runtime, the code executed will be the one compiled. A hierarchy
change is not a backward-compatible change, so one can expect surprises
and not something undefined.

>
> While it's easy to say "you should do what the code says", taking that
> too far ties our hands behind our back, and makes switches that
> should be O(1) into O(n).

???, not sure I understand.
If we record which case was executed for a given class in a hashmap and
use it as a cache, it will always be O(1) for all subsequent calls with
the same class.

Rémi

>
> On 4/5/2018 11:21 AM, Remi Forax wrote:
>> Or we can not try to do any check at runtime that validates the view
>> of the world at compile time.
>> Currently, there is no check that verifies that the catch blocks are
>> in the right order or that a cascade of if-instanceofs means the same
>> thing at compile time and at runtime.
>>
>> My opinion, we should just run the code that was compiled, even if the
>> world has changed between the compilation and the execution.

From amaembo at gmail.com Thu Apr 5 19:41:16 2018
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 5 Apr 2018 22:41:16 +0300
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
 <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr>
 <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com>
 <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hello!

Is it too harsh to reject the whole class if the assumptions on class
hierarchy which were necessary to compile the switch statements used in
the class are not valid at runtime? E.g.
compiler may gather all the assumptions across all the pattern-matching
switches within the class and add some instructions to the class
initializer which check these assumptions at once (probably calling some
validation method which receives the expected hierarchy in some packed
way)? This way the fail-fast behavior will be guaranteed (class refuses
to initialize) and while some expensive runtime checks are to be made
during class initialization, in case of several pattern switches in the
same class, the number of checks will be reduced (although they still
will be performed even if no such switch is actually executed).

With best regards,
Tagir Valeev.

On Thu, Apr 5, 2018 at 6:40 PM, wrote:

> ----- Original Message -----
> > From: "Brian Goetz"
> > To: "Remi Forax"
> > Cc: "mark" , "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> > Sent: Thursday, April 5, 2018 17:25:36
> > Subject: Re: Compile-time type hierarchy information in pattern switch
>
> > Yes, this is surely an option.
> >
> > But it doesn't answer the underlying question -- if the hierarchy
> > changes in various ways between compile and runtime, what behavior can
> > the user count on, and what changes yield "undefined" behavior?
>
> no, it's not undefined, at least not an "undefined behavior" as in C.
> At runtime, the code executed will be the one compiled. A hierarchy
> change is not a backward-compatible change, so one can expect surprises
> and not something undefined.
>
> >
> > While it's easy to say "you should do what the code says", taking that
> > too far ties our hands behind our back, and makes switches that
> > should be O(1) into O(n).
>
> ???, not sure I understand.
> If we record which case was executed for a given class in a hashmap and
> use it as a cache, it will always be O(1) for all subsequent calls with
> the same class.
>
> Rémi
>
> >
> > On 4/5/2018 11:21 AM, Remi Forax wrote:
> >> Or we can not try to do any check at runtime that validates the view
> >> of the world at compile time.
> >> Currently, there is no check that verifies that the catch blocks are
> >> in the right order or that a cascade of if-instanceofs means the same
> >> thing at compile time and at runtime.
> >>
> >> My opinion, we should just run the code that was compiled, even if
> >> the world has changed between the compilation and the execution.
>

From brian.goetz at oracle.com Thu Apr 5 19:42:12 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 5 Apr 2018 15:42:12 -0400
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <67674e17-1e86-252f-72f2-9dd6ab78c03e@gmail.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
 <67674e17-1e86-252f-72f2-9dd6ab78c03e@gmail.com>
Message-ID:

> That's if you want to "fix" the order of cases at link-time in order
> to compute optimal dispatch logic. If you only want to verify and
> bail-out if they are not sorted already (i.e. you only accept changes
> in type hierarchy that don't change order of cases), you always need
> just n-1 comparisons.

Perhaps I'm dense, but I don't see this. Suppose I have completely
unrelated interfaces I, J, K, and L. The user says:

    case I:
    case J:
    case K:
    case L:

which is fine because they're unordered. At runtime, any of the
following type relations could have been injected:

    J <: I, K <: I, L <: I
    K <: J, L <: J
    L <: K

and these would cause the switch to be misordered (and would have been
rejected at compile time.)

How am I to detect any of these with just three comparisons? If I
pick the obvious n-1 (compare each to their neighbor) I wouldn't
detect any of { L <: J, K <: I, L <: I }.
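
The gap can be shown with a small sketch, assuming a link-time hierarchy
in which `L <: J` has been injected. The method names
`neighborsConsistent` and `allPairsConsistent` are made up for
illustration, and the nested interfaces merely stand in for the types as
they would look at runtime.

```java
import java.util.List;

final class OrderCheckSketch {
    // Hierarchy as seen at link time: L has become a subtype of J
    // after the switch "case I, case J, case K, case L" was compiled.
    interface I {}
    interface J {}
    interface K {}
    interface L extends J {}

    // n-1 comparisons: each label only against its successor.
    static boolean neighborsConsistent(List<Class<?>> cases) {
        for (int i = 0; i + 1 < cases.size(); i++) {
            // An earlier label that is a supertype of a later one
            // dominates it, making the later case dead.
            if (cases.get(i).isAssignableFrom(cases.get(i + 1))) {
                return false;
            }
        }
        return true;
    }

    // n(n-1)/2 comparisons: every earlier label against every later one.
    static boolean allPairsConsistent(List<Class<?>> cases) {
        for (int i = 0; i < cases.size(); i++) {
            for (int j = i + 1; j < cases.size(); j++) {
                if (cases.get(i).isAssignableFrom(cases.get(j))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Class<?>> cases = List.of(I.class, J.class, K.class, L.class);
        System.out.println(neighborsConsistent(cases)); // true  (misses L <: J)
        System.out.println(allPairsConsistent(cases));  // false (finds L <: J)
    }
}
```

The neighbor check accepts the misordered switch because `J` and `L` are
not adjacent; only the quadratic check sees the conflict.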

Skipping ahead, yes, guards do play a part in the ordering, and (a) we
can't detect changes to data at runtime and (b) we can't even
necessarily order the guards. But we can detect changes to type tests
at runtime. The question is whether we should.

> Another problem with re-ordering cases at link time is when you
> support fall-through. What are fall-through(s) in a switch with
> re-ordered cases?

Our story here is straightforward; we lower a switch whose labels are
patterns to a switch whose labels are ints, and encode the patterns (or
parts of them) as the static bootstrap arguments of the classifier
bootstrap (just a more sophisticated version of what we do for longs,
strings, and enums, as discussed previously.) The classifier spits out
a number, and int switch mechanics does the rest. The question is to
what degree we can rely on the compile-time assertion that the inputs
are topologically sorted.

From brian.goetz at oracle.com Thu Apr 5 19:44:30 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 5 Apr 2018 15:44:30 -0400
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To:
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
 <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr>
 <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com>
 <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr>
Message-ID: <32a49502-952d-bb60-5572-b356794530e5@oracle.com>

> Is it too harsh to reject the whole class if the assumptions on class
> hierarchy which were necessary to compile the switch statements used
> in the class are not valid at runtime?

That is one of the questions! And the other question is: is this too
expensive to do this check at runtime, given that it will fail so
infrequently.

If we can detect it cheaply enough, though, we can also repair the
situation and fall back to linear testing of patterns.
This seems better (we can execute the statement the user wrote) than
failing. My real question is can I punt on trying to detect it, and
still optimize the common cases down to O(1) dispatch....

From forax at univ-mlv.fr Thu Apr 5 20:28:19 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 5 Apr 2018 22:28:19 +0200 (CEST)
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <32a49502-952d-bb60-5572-b356794530e5@oracle.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
 <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr>
 <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com>
 <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr>
 <32a49502-952d-bb60-5572-b356794530e5@oracle.com>
Message-ID: <712022705.545287.1522960099581.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Tagir Valeev" , "amber-spec-experts"
> Sent: Thursday, April 5, 2018 21:44:30
> Subject: Re: Compile-time type hierarchy information in pattern switch

>> Is it too harsh to reject the whole class if the assumptions on class
>> hierarchy which were necessary to compile the switch statements used
>> in the class are not valid at runtime?
>
> That is one of the questions! And the other question is: is this too
> expensive to do this check at runtime, given that it will fail so
> infrequently.
>
> If we can detect it cheaply enough, though, we can also repair the
> situation and fall back to linear testing of patterns. This seems
> better (we can execute the statement the user wrote) than failing. My
> real question is can I punt on trying to detect it, and still optimize
> the common cases down to O(1) dispatch....

the way to detect it is to use the DAG of the supertypes (lazily
constructed*), from the last to the first case, the idea is to propagate
the index of the case down to the super types, if during the
propagation, you find a supertype which is also a case and with an index
lower than the one currently propagated, then it's a failure.

Rémi

* you do not have to actually create the DAG, just be able to traverse
it from the subtype to the supertypes.

From peter.levart at gmail.com Thu Apr 5 21:06:50 2018
From: peter.levart at gmail.com (Peter Levart)
Date: Thu, 5 Apr 2018 23:06:50 +0200
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To:
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
 <67674e17-1e86-252f-72f2-9dd6ab78c03e@gmail.com>
Message-ID:

On 04/05/18 21:42, Brian Goetz wrote:
>
>> That's if you want to "fix" the order of cases at link-time in order
>> to compute optimal dispatch logic. If you only want to verify and
>> bail-out if they are not sorted already (i.e. you only accept changes
>> in type hierarchy that don't change order of cases), you always need
>> just n-1 comparisons.
>
> Perhaps I'm dense, but I don't see this. Suppose I have completely
> unrelated interfaces I, J, K, and L. The user says:
>
>     case I:
>     case J:
>     case K:
>     case L:
>
> which is fine because they're unordered. At runtime, any of the
> following type relations could have been injected:
>
>     J <: I, K <: I, L <: I
>     K <: J, L <: J
>     L <: K
>
> and these would cause the switch to be misordered (and would have been
> rejected at compile time.)
>
> How am I to detect any of these with just three comparisons? If I
> pick the obvious n-1 (compare each to their neighbor) I wouldn't
> detect any of { L <: J, K <: I, L <: I }.

You're right.
Linear sorting would not help as there's no total order that could be
derived from subtyping relationships. But as you say at the end,
subtyping relationships form a directed acyclic graph on which you can
perform topological sorting in linear time.

Let's start with a list of cases that have already been ordered
topologically at compile time. Say I, J, K, L (as in your example
above). The types could be completely unrelated or there could be type
relationships among them. Let's add to them synthetic "subtype"
relationships (marked with <. to distinguish them from real subtype
relationships <:) according to the compile-time order of cases:

    I <. J
    J <. K
    K <. L

Together with real direct subtype relationships, those form a graph. We
just have to find out if this graph is acyclic or not. If it does not
have a cycle, the order of case(s) is still OK and the switch is still
valid. Otherwise the subtype relationships have changed in a way that
makes the compile-time order of cases invalid.

Finding a cycle can be performed in linear time.

Have I missed something this time too?

Regards, Peter
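
A rough sketch of such a check follows, written in the index-propagation
style Remi described: walk the supertypes of each case from last to
first and fail on meeting a case with a lower index. Everything here is
illustrative rather than taken from either implementation:
`orderStillValid` and `pushSupertypes` are hypothetical names, and
treating `Object` as a supertype of every interface is an assumption
made so that a `case Object` placed ahead of an interface case is caught.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class OrderVerifySketch {
    // Hierarchy as seen at link time: L <: J was added after compilation.
    interface I {}
    interface J {}
    interface K {}
    interface L extends J {}

    // Returns false if some case has a supertype that appears earlier
    // in the list, i.e. the compile-time order is no longer valid.
    static boolean orderStillValid(List<Class<?>> cases) {
        Map<Class<?>, Integer> indexOf = new HashMap<>();
        for (int i = 0; i < cases.size(); i++) {
            indexOf.putIfAbsent(cases.get(i), i);
        }
        // Shared across cases: a supertype already walked from a later
        // case without conflict cannot conflict with an earlier case.
        Set<Class<?>> visited = new HashSet<>();
        for (int j = cases.size() - 1; j >= 0; j--) {
            Deque<Class<?>> stack = new ArrayDeque<>();
            pushSupertypes(cases.get(j), stack);
            while (!stack.isEmpty()) {
                Class<?> s = stack.pop();
                if (!visited.add(s)) {
                    continue; // already propagated through this type
                }
                Integer idx = indexOf.get(s);
                if (idx != null && idx < j) {
                    return false; // earlier case dominates case j
                }
                pushSupertypes(s, stack);
            }
        }
        return true;
    }

    // Direct supertypes only; the hierarchy is traversed, never built.
    static void pushSupertypes(Class<?> t, Deque<Class<?>> stack) {
        if (t.getSuperclass() != null) {
            stack.push(t.getSuperclass());
        } else if (t.isInterface()) {
            stack.push(Object.class); // assumption: Object tops interfaces
        }
        for (Class<?> iface : t.getInterfaces()) {
            stack.push(iface);
        }
    }

    public static void main(String[] args) {
        System.out.println(orderStillValid(
            List.of(I.class, J.class, K.class, L.class)));   // false
        System.out.println(orderStillValid(
            List.of(String.class, Object.class)));           // true
    }
}
```

Because each supertype is visited at most once, the work is bounded by
the size of the traversed hierarchy rather than by the number of case
pairs.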

From peter.levart at gmail.com Thu Apr 5 21:20:11 2018
From: peter.levart at gmail.com (Peter Levart)
Date: Thu, 5 Apr 2018 23:20:11 +0200
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <712022705.545287.1522960099581.JavaMail.zimbra@u-pem.fr>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
 <20180404170105.5486a1c7@copperhead.int.arc7.info>
 <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com>
 <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr>
 <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com>
 <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr>
 <32a49502-952d-bb60-5572-b356794530e5@oracle.com>
 <712022705.545287.1522960099581.JavaMail.zimbra@u-pem.fr>
Message-ID: <857ce394-f249-c1e9-27b7-223697078744@gmail.com>

On 04/05/18 22:28, Remi Forax wrote:
> the way to detect it is to use the DAG of the supertypes (lazily
> constructed*), from the last to the first case, the idea is to
> propagate the index of the case down to the super types, if during the
> propagation, you find a supertype which is also a case and with an
> index lower than the one currently propagated, then it's a failure.
>
> Rémi
>
> * you do not have to actually create the DAG, just be able to traverse
> it from the subtype to the supertypes.

Yes, this idea is similar to mine. We just have to find a conflict
between subtype relationships and compile time order of cases which
could be viewed as forming implicit pair-by-pair relationships of
consecutive cases. If there's a cycle, we have a conflict.

Peter

From forax at univ-mlv.fr Thu Apr 5 23:49:59 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 6 Apr 2018 01:49:59 +0200 (CEST)
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com>
Message-ID: <1078249793.557744.1522972199554.JavaMail.zimbra@u-pem.fr>

I've implemented a first version
https://github.com/forax/exotic/blob/master/src/main/java/com.github.forax.exotic/com/github/forax/exotic/TypeSwitch.java
https://github.com/forax/exotic/blob/master/src/main/java/com.github.forax.exotic/com/github/forax/exotic/TypeSwitchCallSite.java
(I've changed the convention to be null -> -1, unknown -> -2 because
it's easier to do the nullcheck upfront instead of at the end)

and I've written a small JMH benchmark
https://github.com/forax/exotic/blob/master/src/test/java/com.github.forax.exotic/com/github/forax/exotic/perf/TypeSwitchBenchMark.java
that compares the type-switch with a cascade of if ... else.

I've found (on my laptop, so it may not be true on a server) that
the speed depends on
- the number of cases
- the number of different classes a switch can see at runtime.

The current implementation is independent of the number of cases; it
uses an inlining cache of 'if getClass', which is great if there are few
classes at runtime, and changes itself to use a ClassValue if there are
too many classes at runtime.

Benchmark                                     Mode  Cnt    Score   Error  Units
TypeSwitchBenchMark.long_instanceof_cascade   avgt   15  358.876 ± 2.868  ns/op
TypeSwitchBenchMark.long_type_switch          avgt   15   49.870 ± 0.702  ns/op
TypeSwitchBenchMark.short_instanceof_cascade  avgt   15    7.016 ± 0.017  ns/op
TypeSwitchBenchMark.short_type_switch         avgt   15    5.978 ± 0.054  ns/op

I think the current implementation is not enough because the cost of
using a ClassValue is quite high, so if there are few cases and quite a
lot of different classes at runtime, the implementation should switch to
use a cascade of instanceof instead of using a ClassValue.

What should be implemented in my opinion is something like this:

                          number of classes seen at runtime
                              small       |       big
                  small    if getClass    |   if instanceof
 number of cases  ---------------------------------------------
                  big      if getClass    |   ClassValue.get

And once we have an implementation a little more realistic, we can
implement the verification (see my previous mail) to see its impact.

cheers,
Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Tuesday, April 3, 2018 18:36:43
> Subject: Compile-time type hierarchy information in pattern switch

> Along the lines of the previous discussion about separate compilation
> skew with enums ... I'm trying to find the right place to draw the line
> with respect to post-compilation class hierarchy changes.
>
> Recall that we can impose a _dominance ordering_ on patterns; pattern P
> dominates Q if everything that is matched by Q also is matched by P.
> We already use this today, in catch blocks, to reject programs with dead
> code; you can't say `catch Exception` before `catch IOException`,
> because the latter block would be dead. We want to do the same with
> patterns, so:
>
>     case String x: ...
>     case Object x: ...
>
> is OK but
>
>     case Object x: ...
>     case String x: ...
>
> is rejected at compile time.
>
> Separately, we'd like for pattern matching to be efficient; the
> definition of "inefficient" would be for pattern matching to be
> inherently O(n), when we can frequently do much better. There's plenty
> of literature on compiling patterns to decision trees, but none of it
> addresses the problem we have to: separate compilation.
> So any decision
> tree computed at compile time might be wrong in undesirable ways by
> runtime. We could also compute a decision tree at runtime using indy;
> while this is our intent, the devil is in the details. We don't want
> computing the tree to be too expensive, nor do we want to have to
> capture O(n^2) compile-time constraints to be validated at runtime. So
> I'd like to focus on what changes we're willing to accept between
> compilation and runtime, what our expectations would be for those changes.
>
> We've already discussed one of these: novel values in enum / sealed type
> switches, and for them, the answer is throwing some sort of exception.
> Another that we dealt with long ago is changing enum ordinals; we
> decided at the time that we're willing for this to be a BC change, so we
> generate extra code that uses the as-runtime ordinals rather than the
> as-compile-time ordinals when lowering the switch into an integer
> switch. (If we weren't willing to tolerate such changes, we'd have a
> simpler translation: just lower an enum switch to a switch on its
> ordinal.)
>
> Here's one that I suspect we're not expecting to recover terribly well
> from: hierarchy inversion. Suppose at compile time A <: B. So the
> following is a sensible switch body:
>
>     case String: println("String"); break;
>     case Object: println("Object"); break;
>
> Now, imagine that by runtime, String no longer extends Object, but
> instead Object absurdly extends String. Do we still expect the above to
> print String for all Strings, and Object for everything else? Or is the
> latter arm now dead at runtime, even though it wouldn't compile after
> the change? Or is this now UB, because it would no longer compile?
>
> A more realistic example of a hierarchy change is introducing an
> interface. If we have:
>
>     interface I { }
>     class C { }
>
> and a switch
>
>     case I: ...
>     case C: ...
>
> and later, we make C implement I, we have a similar situation; the
> switch would no longer compile. Are we allowed to make optimizations
> based on the compile-time knowledge that C does not implement I?
>
> As an example, suppose A, B, C, ... Z are final classes, and I is an
> interface implemented by none of them. Then I can dispatch:
>
>     case A: ...
>     case B: ...
>     ...
>     case I: ...
>     ...
>     case Z: ...
>     case Object: ...
>
> in two type operations; hash the class of the target and look it up in a
> table containing A...Z, and then do a test against I. However, if I'm
> required to deal with the case where some of A..Z are retrofitted to
> implement I after compile time, and I'm expected to process the switch
> in order based on how it is written, then I have to fall back to O(n)
> type operations at runtime, or, I have to do as many as O(n^2) type
> comparisons at link time. These are steep cliffs to fall off of.
> (Mandating throwing an exception at link time is also expensive.)
>
> Today, all switch cases are totally unordered, so we're free to execute
> them in O(1) time. I'd like for that to continue to be the case, even
> as we add more complex switches.
>
> So, let's have a conversation about expectations for what we should do
> for a switch at runtime that would no longer compile due to
> post-compilation hierarchy changes (new supertypes, hierarchy
> inversions, removed supertypes, final <--> nonfinal, etc.)

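The two-step dispatch in the quoted example (a hash lookup over the final classes, then a single test against the interface) might be sketched as below. This is only an illustration, assuming the hierarchy is still the one seen at compile time; `TwoStepDispatch`, the nested stand-in types `A`, `B`, `Z`, `I`, and the case numbering are hypothetical names chosen for the sketch, not part of any proposed API.

```java
import java.util.Map;

final class TwoStepDispatch {
    // Hypothetical stand-ins for the final classes and the interface in the example.
    static final class A {}
    static final class B {}
    static final class Z {}
    interface I {}

    // Case indices in source order: A=0, B=1, I=2, Z=3, Object=4.
    private static final Map<Class<?>, Integer> EXACT =
            Map.of(A.class, 0, B.class, 1, Z.class, 3);
    private static final int CASE_I = 2;
    private static final int CASE_OBJECT = 4;

    static int classify(Object target) {
        // Type operation 1: hash lookup on the exact class.
        // This is only valid because A, B and Z are final and (by assumption)
        // none of them implements I.
        Integer exact = EXACT.get(target.getClass());
        if (exact != null) {
            return exact;
        }
        // Type operation 2: a single test against the interface.
        if (target instanceof I) {
            return CASE_I;
        }
        return CASE_OBJECT;
    }
}
```

If `Z` were later changed to implement `I`, the table lookup would still answer 3 even though source order says `case I` should win; that is exactly the skew the message asks about.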
From peter.levart at gmail.com Fri Apr 6 09:01:23 2018
From: peter.levart at gmail.com (Peter Levart)
Date: Fri, 6 Apr 2018 11:01:23 +0200
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <857ce394-f249-c1e9-27b7-223697078744@gmail.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com> <20180404170105.5486a1c7@copperhead.int.arc7.info> <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com> <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr> <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com> <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr> <32a49502-952d-bb60-5572-b356794530e5@oracle.com> <712022705.545287.1522960099581.JavaMail.zimbra@u-pem.fr> <857ce394-f249-c1e9-27b7-223697078744@gmail.com>
Message-ID: <9367cb97-12b0-b6ac-ca18-fece90e027b5@gmail.com>

On 04/05/2018 11:20 PM, Peter Levart wrote:
>
> On 04/05/18 22:28, Remi Forax wrote:
>> the way to detect it is to use the DAG of the supertypes (lazily constructed*), from the last to the first case, the idea is to propagate the index of the case down to the super types, if during the propagation, you find a supertype which is also a case and with an index lower than the currently propagated, then it's a failure.
>>
>> Rémi
>>
>> * you do not have to actually create the DAG, just be able to traverse it from the subtype to the supertypes.
>
> Yes, this idea is similar to mine. We just have to find a conflict
> between subtype relationships and compile time order of cases which
> could be viewed as forming implicit pair-by-pair relationships of
> consecutive cases. If there's a cycle, we have a conflict.
>

And Remi's algorithm is of course the best implementation of this search. Here's a variant that does not need an index, just a set of types:

    start with an empty set S
    for each case type T from the last case up to the first:
        if S contains T:
            bail out with error
        add T and all its supertypes to S

The time complexity of this algorithm is O(n). It takes at most n * k lookups into a (hash)set, where k is the average number of supertypes of a case type. Usually, when case types share common supertypes that are not far away, the algorithm can prune branches of the type hierarchy that were already visited. Implementation-wise, if the algorithm uses a HashMap that maps each visited type to the case type it was visited from (back to Remi's index of case), it can also produce a meaningful diagnostic message, mentioning precisely which two cases are in the wrong order according to the type hierarchy:

    Class<?>[] caseTypes = ...;
    TypeVisitor visitor = new TypeVisitor();
    // walk the cases from the last one up to the first
    for (int i = caseTypes.length - 1; i >= 0; i--) {
        visitor.visitType(caseTypes[i]);
    }

    // maps each visited type to the case type it was reached from
    class TypeVisitor extends HashMap<Class<?>, Class<?>> {

        void visitType(Class<?> caseType) {
            Class<?> conflictingCaseType = putIfAbsent(caseType, caseType);
            if (conflictingCaseType != null) {
                throw new IllegalStateException(
                    "Case " + conflictingCaseType.getName() +
                    " matches a subtype of what case " + caseType.getName() +
                    " matches but is located after it");
            }
            visitSupertypes(caseType, caseType);
        }

        private void visitSupertypes(Class<?> type, Class<?> caseType) {
            Class<?> superclass = type.getSuperclass();
            if (superclass != null && putIfAbsent(superclass, caseType) == null) {
                visitSupertypes(superclass, caseType);
            }
            for (Class<?> superinterface : type.getInterfaces()) {
                if (putIfAbsent(superinterface, caseType) == null) {
                    visitSupertypes(superinterface, caseType);
                }
            }
        }
    }

Regards, Peter

From brian.goetz at oracle.com Fri Apr 6 12:48:50 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 6 Apr 2018 08:48:50 -0400
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <9367cb97-12b0-b6ac-ca18-fece90e027b5@gmail.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com> <20180404170105.5486a1c7@copperhead.int.arc7.info> <0e38a01f-b9bb-7539-5e0e-1df02f33d69f@oracle.com> <1510056942.420626.1522941712175.JavaMail.zimbra@u-pem.fr> <9bbda7e1-62be-e737-e277-c437da3c241b@oracle.com> <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr> <32a49502-952d-bb60-5572-b356794530e5@oracle.com> <712022705.545287.1522960099581.JavaMail.zimbra@u-pem.fr> <857ce394-f249-c1e9-27b7-223697078744@gmail.com> <9367cb97-12b0-b6ac-ca18-fece90e027b5@gmail.com>
Message-ID: <9f103cbc-5750-c58b-6d29-33c14ca7b45d@oracle.com>

This may be O(n), but it's not really something I want to do when linking a call site...

From forax at univ-mlv.fr Fri Apr 6 13:10:20 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 6 Apr 2018 15:10:20 +0200 (CEST)
Subject: Compile-time type hierarchy information in pattern switch
In-Reply-To: <9f103cbc-5750-c58b-6d29-33c14ca7b45d@oracle.com>
References: <2a815079-881a-4a79-592e-7f86a90cae88@oracle.com> <11186599.458653.1522942859001.JavaMail.zimbra@u-pem.fr> <32a49502-952d-bb60-5572-b356794530e5@oracle.com> <712022705.545287.1522960099581.JavaMail.zimbra@u-pem.fr> <857ce394-f249-c1e9-27b7-223697078744@gmail.com> <9367cb97-12b0-b6ac-ca18-fece90e027b5@gmail.com> <9f103cbc-5750-c58b-6d29-33c14ca7b45d@oracle.com>
Message-ID: <1102362626.806180.1523020220830.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Peter Levart", "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, April 6, 2018 14:48:50
> Subject: Re: Compile-time type hierarchy information in pattern switch

> This may be O(n), but its not really something I want to do when linking
> a call site...

I agree :)

Anyway, I've implemented Peter's algorithm, fixed a typo (the loop had i++ instead of i--) and added the fact that conceptually the supertype of an interface is java.lang.Object
https://github.com/forax/exotic/blob/master/src/main/java/com.github.forax.exotic/com/github/forax/exotic/TypeSwitch.java#L96

I've also implemented a strategy that uses if instanceof, but it doesn't perform well. I suppose it's because with a real instanceof the VM gathers a profile, while with Class.isInstance() it does not. It can be fixed by adding a special method handle combiner to java.lang.invoke.MethodHandles. That said, I may be wrong; it's perhaps something else.

Benchmark                                           Mode  Cnt    Score   Error  Units
TypeSwitchBenchMark.big_big_instanceof_cascade      avgt   15  367.409 ± 1.844  ns/op
TypeSwitchBenchMark.big_big_type_switch             avgt   15   52.238 ± 0.455  ns/op
TypeSwitchBenchMark.small_big_instanceof_cascade    avgt   15   34.940 ± 0.287  ns/op
TypeSwitchBenchMark.small_big_type_switch           avgt   15   52.204 ± 0.319  ns/op
TypeSwitchBenchMark.small_small_instanceof_cascade  avgt   15    7.320 ± 0.100  ns/op
TypeSwitchBenchMark.small_small_type_switch         avgt   15    6.122 ± 0.027  ns/op

The first big/small is for the number of cases, the second is for the number of classes seen at runtime, so small_big_type_switch means a small number of cases with a lot of runtime classes.

To summarize: if the number of classes seen at runtime is small, the type_switch wins (it uses if getClass); if there are a lot of cases, the type_switch wins (it uses ClassValue.get()); but if there are few cases and a lot of classes, the type_switch is behind :(

Rémi

From brian.goetz at oracle.com Fri Apr 6 15:51:49 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 6 Apr 2018 11:51:49 -0400
Subject: Switch translation
Message-ID: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com>

The following outlines our story for translating improved switches, including both the switch improvements coming as part of JEP 325, and follow-on work to add pattern matching to switches. Much of this has been discussed already over the last year, but here it is in one place.

# Switch Translation
#### Maurizio Cimadamore and Brian Goetz
#### April 2018

## Part 1 -- constant switches

This part examines the current translation of `switch` constructs by `javac`, and proposes a more general translation for switching on primitives, boxes, strings, and enums, with the goals of:

 - Unifying the treatment of `switch` variants, simplifying the compiler implementation and reducing the static footprint of generated code;
 - Moving responsibility for target classification from compile time to run time, allowing us to more freely update the logic without updating the compiler.

## Current translation

Switches on `int` (and the smaller integer primitives) are translated in one of two ways. If the labels are relatively dense, we translate an `int` switch to a `tableswitch`; if they are sparse, we translate to a `lookupswitch`. The current heuristic appears to be that we use a `tableswitch` if it results in a smaller bytecode than a `lookupswitch` (which uses twice as many bytes per entry), which is a reasonable heuristic.

#### Switches on boxes

Switches on primitive boxes are currently implemented as if they were primitive switches, unconditionally unboxing the target before entry (possibly throwing NPE).

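As a small illustration of the box case, the two methods below should behave the same way, including the `NullPointerException` on a `null` target. This is a hand-written sketch of the effect of the translation, not the actual `javac` output; `BoxSwitch` and its method names are made up for the example.

```java
final class BoxSwitch {
    // What the programmer writes: a switch directly on the box.
    static String describe(Integer boxed) {
        switch (boxed) {                  // implicitly unboxes; NPE if boxed is null
            case 1:  return "one";
            case 2:  return "two";
            default: return "many";
        }
    }

    // Roughly equivalent code with the unboxing written out by hand.
    static String describeUnboxed(Integer boxed) {
        int value = boxed.intValue();     // NPE here if boxed is null
        switch (value) {
            case 1:  return "one";
            case 2:  return "two";
            default: return "many";
        }
    }
}
```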
#### Switches on strings

Switches on strings are implemented as a two-step process, exploiting the fact that strings cache their `hashCode()` and that hash codes are reasonably spread out. Given a switch on strings like the one below:

    switch (s) {
        case "Hello": ...
        case "World": ...
        default: ...
    }

The compiler desugars this into two separate switches, where the first switch maps the input strings into a range of numbers [0..1], as shown below, which can then be used in a subsequent plain switch on ints. The generated code unconditionally calls `hashCode()`, again possibly throwing NPE.

    int index = -1;
    switch (s.hashCode()) {
        case 12345: if (!s.equals("Hello")) break; index = 0; break;
        case 6789: if (!s.equals("World")) break; index = 1; break;
        default: index = -1;
    }
    switch (index) {
        case 0: ...
        case 1: ...
        default: ...
    }

If there are hash collisions between the strings, the first switch must try all possible matching strings.

#### Switches on enums

Switches on `enum` constants exploit the fact that enums have (usually dense) integral ordinal values. Unfortunately, because an ordinal value can change between compilation time and runtime, we cannot rely on this mapping directly, but instead need to do an extra layer of mapping. Given a switch like:

    switch(color) {
        case RED: ...
        case GREEN: ...
    }

The compiler numbers the cases starting at 1 (as with string switch), and creates a synthetic class that maps the runtime values of the enum ordinals to the statically numbered cases:

    class Outer$0 {
        synthetic static final int[] $EnumMap$Color = new int[Color.values().length];
        static {
            try { $EnumMap$Color[RED.ordinal()] = 1; } catch (NoSuchFieldError ex) {}
            try { $EnumMap$Color[GREEN.ordinal()] = 2; } catch (NoSuchFieldError ex) {}
        }
    }

Then, the switch is translated as follows:

    switch(Outer$0.$EnumMap$Color[color.ordinal()]) {
        case 1: stmt1;
        case 2: stmt2;
    }

In other words, we construct an array whose size is the cardinality of the enum, and the element at position *i* of that array contains the case index corresponding to the enum constant whose ordinal is *i*.

## A more general scheme

The handling of strings and enums gives us a hint of how to create a more regular scheme; for `switch` targets more complex than `int`, we lower the `switch` to an `int` switch with consecutive `case` labels, and use a separate process to map the target into the range of synthetic case labels.

Now that we have `invokedynamic` in our toolbox, we can reduce all of the non-`int` cases to a single form, where we number the cases with consecutive integers, and perform case selection via an `invokedynamic`-based classifier function, whose static argument list receives a description of the actual targets, and which returns an `int` identifying what `case` to select.

This approach has several advantages:

 - Reduced compiler complexity -- all switches follow a common pattern;
 - Reduced static code size;
 - The classification function can select from a wide range of strategies (linear search, binary search, building a `HashMap`, constructing a perfect hash function, etc), which can vary over time or from situation to situation;
 - We are free to improve the strategy or select an alternate strategy (say, to optimize for startup time) without having to recompile the code;
 - Hopefully at least as JIT-friendly as the existing translation, if not more.

We can also use this approach in preference to `lookupswitch` for non-dense `int` switches, as well as use it to extend `switch` to handle `long`, `float`, and `double` targets (which were surely excluded in part because the JVM didn't provide a convenient translation target for these types.)

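To make the shape of such a classifier concrete, here is a minimal sketch of a bootstrap-style method that uses the `HashMap` strategy for string labels. It is not the implementation from the amber repository: `StringClassifier`, `bootstrap`, and `classifyOnce` are hypothetical names, the "return the number of labels when nothing matches" convention is an assumption of this sketch, and `null` is simply treated as "no match" here. Java source cannot emit `invokedynamic` directly, so `classifyOnce` calls the bootstrap by hand and invokes the resulting call site.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

final class StringClassifier {
    private final Map<String, Integer> indexByLabel = new HashMap<>();
    private final int noMatch;

    private StringClassifier(String[] labels) {
        // Number the labels 0..N-1 in source order.
        for (int i = 0; i < labels.length; i++) {
            indexByLabel.putIfAbsent(labels[i], i);
        }
        this.noMatch = labels.length;   // assumed convention: N means "no match"
    }

    int classify(String target) {
        return indexByLabel.getOrDefault(target, noMatch);
    }

    // Bootstrap-shaped method: the static arguments describe the case labels,
    // and the returned call site maps a String target to a case index.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name,
                              MethodType type, String... labels)
            throws ReflectiveOperationException {
        MethodHandle classify = MethodHandles.lookup()
                .findVirtual(StringClassifier.class, "classify",
                             MethodType.methodType(int.class, String.class))
                .bindTo(new StringClassifier(labels));
        return new ConstantCallSite(classify.asType(type));
    }

    // Stand-in for the invokedynamic instruction the compiler would emit.
    static int classifyOnce(String target, String... labels) {
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "classify",
                    MethodType.methodType(int.class, String.class), labels);
            return (int) site.dynamicInvoker().invokeExact(target);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Because the strategy lives behind the call site, swapping the `HashMap` for a binary search or a perfect hash would not require recompiling the code that contains the switch.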
#### Bootstrap design

When designing the `invokedynamic` bootstraps to support this translation, we face the classic lumping-vs-splitting decision. For now, we'll bias towards splitting. In the following example, `BOOTSTRAP_PREAMBLE` indicates the usual leading arguments for an indy bootstrap. We assume the compiler has numbered the case values densely from 0 to N-1, and the bootstrap will return a value in [0,N) for success, or N for "no match".

A strawman design might be:

    // Numeric switches for P, accepts invocation as P -> I or Box(P) -> I
    CallSite intSwitch(BOOTSTRAP_PREAMBLE, int... caseValues)

    // Switch for String, invocation descriptor is String -> I
    CallSite stringSwitch(BOOTSTRAP_PREAMBLE, String... caseValues)

    // Switch for Enum, invocation descriptor is E -> I
    CallSite enumSwitch(BOOTSTRAP_PREAMBLE, Class<? extends Enum<?>> clazz,
                        String... caseNames)

It might be possible to encode all of these into a single bootstrap, but given that the compiler already treats each type slightly differently, it seems there is little value in this sort of lumping for non-pattern switches.

The `enumSwitch` bootstrap as proposed uses `String` values to describe the enum constants, rather than encoding the enum constants directly via condy. This allows us to be more robust to enums disappearing after compilation.

This strategy is also dependent on having broken the limitation on 251 bootstrap arguments in indy/condy.

#### Extending to other primitive types

This approach extends naturally to other primitive types (long, double, float), by the addition of some more bootstraps (which need to deal with the additional complexities of infinity, NaN, etc):

    CallSite longSwitch(BOOTSTRAP_PREAMBLE, long... caseValues)
    CallSite floatSwitch(BOOTSTRAP_PREAMBLE, float... caseValues)
    CallSite doubleSwitch(BOOTSTRAP_PREAMBLE, double... caseValues)

#### Extending to null

The scheme as proposed above does not explicitly handle nulls, which is a feature we'd like to have in `switch`. There are a few ways we could add null handling into the API:

 - Split entry points into null-friendly or null-hostile switches;
 - Find a way to encode nulls in the array of case values (which can be done with condy);
 - Always treat null as a possible input and a distinguished output, and have the compiler ensure the switch can handle this distinguished output.

The last strategy is appealing and straightforward; assign a sentinel value (-1) to `null`, and always return this sentinel when the input is null. The compiler ensures that some case handles `null`, and if no case handles `null` then it inserts an implicit

    case -1: throw new NullPointerException();

into the generated code.

#### General example

If we have a string switch:

    switch (x) {
        case "Foo": m(); break;
        case "Bar": n(); // fall through
        case "Baz": r(); break;
        default: p();
    }

we translate into:

    int t = indy[bsm=stringSwitch["Foo", "Bar", "Baz"]](x)
    switch (t) {
        case -1: throw new NullPointerException();  // implicit null case
        case 0: m(); break;
        case 1: n(); // fall through
        case 2: r(); break;
        case 3: p();                                // default case
    }

All switches, with the exception of `int` switches (and maybe not even non-dense `int` switches), follow this exact pattern. If the target type is not a reference type, the `null` case is not needed.

This strategy is implemented in the `switch` branch of the amber repository; see `java.lang.runtime.SwitchBootstraps` in that branch for (rough!) implementations of the bootstraps.

## Patterns in narrow-target switches

When we add patterns, we may encounter switches whose targets are tightly typed (e.g., `String` or `int`) but still use some patterns in their expression.
For switches whose target type is a primitive, primitive box, `String`, or `enum`, we'd like to use the optimized translation strategy outlined here, but the following kinds of patterns might still show up in a switch on, say, `Integer`:

    case var x:
    case _:
    case Integer x:
    case Integer(var x):

The first three can be translated away by the source compiler, as they are semantically equivalent to `default`. If any nontrivial patterns are present (including deconstruction patterns), we may need to translate as a pattern switch scheme -- see Part 2. (While the language may not distinguish between "legacy" and "pattern" switches -- in that all switches are pattern switches -- we'd like to avoid giving up obvious optimizations if we can.)

# Part 2 -- type test patterns and guards

A key motivation for reexamining switch translation is the impending arrival of patterns in switch. We expect switch translation for the pattern case to follow a similar structure -- lower to an `int` switch and use an indy-based classifier to select an index. However, there are a few additional complexities. One is that pattern cases may have guards, which means we need to be able to re-enter the bootstrap with an indication to "continue matching from case N", in the event of a failed guard. (Even if the language doesn't support guards directly, the obvious implementation strategy for nested patterns is to desugar them into guards.)

Translating pattern switches is more complicated because there are more options for how to divide the work between the statically generated code and the switch classifier, and different choices have different performance side-effects (are binding variables "boxed" into a tuple to be returned, or do they need to be redundantly calculated).

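The "continue matching from case N" idea can be illustrated with a plain-Java stand-in for the classifier. This is only a sketch under assumptions: `RestartableTypeSwitch`, `classify`, and `describe` are hypothetical names, the classifier is a simple linear scan rather than an indy call site, and the -1 / N sentinels follow the conventions described in Part 1.

```java
final class RestartableTypeSwitch {
    // Returns -1 for null, the index of the first matching case at or after
    // 'start', or caseTypes.length when nothing matches.
    static int classify(Object target, int start, Class<?>... caseTypes) {
        if (target == null) {
            return -1;
        }
        for (int i = start; i < caseTypes.length; i++) {
            if (caseTypes[i].isInstance(target)) {
                return i;
            }
        }
        return caseTypes.length;
    }

    // Hand-written stand-in for generated code for a switch shaped like:
    //   case String s where s.length() > 0  -> "non-empty string"
    //   case CharSequence cs                -> "char sequence"
    //   default                             -> "other"
    static String describe(Object x) {
        int start = 0;
        while (true) {
            int index = classify(x, start, String.class, CharSequence.class);
            switch (index) {
                case -1:
                    throw new NullPointerException();
                case 0: {
                    String s = (String) x;
                    if (s.length() > 0) {
                        return "non-empty string";
                    }
                    start = 1;      // guard failed: resume matching after case 0
                    continue;
                }
                case 1:
                    return "char sequence";
                default:
                    return "other";
            }
        }
    }
}
```

An empty string fails the guard on case 0, re-enters the classifier with `start = 1`, and lands on the `CharSequence` case.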
## Type-test patterns

Type-test patterns are notable because their applicability predicate is purely based on the type system, meaning that the compiler can directly reason about it both statically (using flow analysis, optimizing away dynamic type tests) and dynamically (with `instanceof`.) A switch involving type-tests:

    switch (x) {
        case String s: ...
        case Integer i: ...
        case Long l: ...
    }

can (among other strategies) be translated into a chain of `if-else` using `instanceof` and casts:

    if (x instanceof String) { String s = (String) x; ... }
    else if (x instanceof Integer) { Integer i = (Integer) x; ... }
    else if (x instanceof Long) { Long l = (Long) x; ... }

#### Guards

The `if-else` desugaring can also naturally handle guards:

    switch (x) {
        case String s
            where (s.length() > 0): ...
        case Integer i
            where (i > 0): ...
        case Long l
            where (l > 0L): ...
    }

can be translated to:

    if (x instanceof String
        && ((String) x).length() > 0) { String s = (String) x; ... }
    else if (x instanceof Integer
             && ((Integer) x) > 0) { Integer i = (Integer) x; ... }
    else if (x instanceof Long
             && ((Long) x) > 0L) { Long l = (Long) x; ... }

#### Performance concerns

The translation to `if-else` chains is simple (for switches without fallthrough), but is harder for the VM to optimize, because we've used a more general control flow mechanism. If the target is an empty `String`, which means we'd pass the first `instanceof` but fail the guard, class-hierarchy analysis could tell us that it can't possibly be an `Integer` or a `Long`, and so there's no need to perform those tests. But generating code that takes advantage of this information is more complex.

In the extreme case, where a switch consists entirely of type test patterns for final classes, this could be performed as an O(1) operation by hashing. And this is a common case involving switches over alternatives in a sum (sealed) type. (We shouldn't rely on finality at compile time, as this can change between compile and run time, but we should take advantage of this at run time if we can.)

Finally, the straightforward static translation may miss opportunities for optimization. For example:

    switch (x) {
        case Point p
            where p.x > 0 && p.y > 0: A
        case Point p
            where p.x > 0 && p.y == 0: B
    }

Here, not only would we potentially test the target twice to see if it is a `Point`, but we then further extract the `x` component twice and perform the `p.x > 0` test twice.

#### Optimization opportunities

The compiler can eliminate some redundant calculations through straightforward techniques. The previous switch can be transformed to:

    switch (x) {
        case Point p:
            if (((Point) p).x > 0 && ((Point) p).y > 0) { A }
            else if (((Point) p).x > 0 && ((Point) p).y == 0) { B }
    }

to eliminate the redundant `instanceof` (and admit further CSE optimizations).

#### Clause reordering

The above example was easy to transform because the two `case Point` clauses were adjacent. But what if they are not? In some cases, it is safe to reorder them. For types `T` and `U`, it is safe to reorder `case T` and `case U` if the two types have no intersection; that is, there can be no type that is a subtype of both. This is true when `T` and `U` are classes and neither extends the other, or when one is a final class and the other is an interface that the class does not implement.

The compiler could then reorder case clauses so that all the ones whose first test is `case Point` are adjacent, and then coalesce them all into a single arm of the `if-else` chain.

A possible spoiler here is fallthrough; if case A falls into case B, then cases A and B have to be moved as a group. (This is another reason to consider limiting fallthrough.)
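The disjointness rule from "Clause reordering" can be sketched as a reflective check. This is only an illustration, assuming ordinary class and interface types (no special treatment of arrays or primitives); the class name `DisjointTypes` and the method `provablyDisjoint` are made up for this example. Because it inspects the classes as loaded, it answers the question against the run-time hierarchy rather than the compile-time one.

```java
import java.lang.reflect.Modifier;

/** Hypothetical helper illustrating the "no common subtype" test for two case types. */
final class DisjointTypes {
    /**
     * Returns true only when no type can be a subtype of both t and u,
     * judged from the class hierarchy as it is currently loaded.
     */
    static boolean provablyDisjoint(Class<?> t, Class<?> u) {
        if (t.isAssignableFrom(u) || u.isAssignableFrom(t)) {
            return false; // one is already a subtype of the other
        }
        boolean tIsClass = !t.isInterface();
        boolean uIsClass = !u.isInterface();
        if (tIsClass && uIsClass) {
            return true;  // unrelated classes: single inheritance rules out a common subclass
        }
        // At least one side is an interface. The pair is disjoint only if the other
        // side is a final class, which (per the check above) does not implement it.
        if (tIsClass && Modifier.isFinal(t.getModifiers())) {
            return true;
        }
        if (uIsClass && Modifier.isFinal(u.getModifiers())) {
            return true;
        }
        return false;     // a non-final class, or a second interface, could gain a common subtype
    }
}
```

For example, `provablyDisjoint(String.class, Integer.class)` and `provablyDisjoint(String.class, Runnable.class)` hold, while a non-final class paired with `Runnable` does not, since a subclass could implement the interface.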
A bigger possible spoiler here is separate compilation. If at compile time, we see that `T` and `U` are disjoint types, do we want to bake that assumption into the compilation, or do we have to re-check that assumption at runtime?

#### Summary of if-else translation

While the if-else translation at first looks pretty bad, we are able to extract a fair amount of redundancy through well-understood compiler transformations. If an N-way switch has only M distinct types in it, in most cases we can reduce the cost from _O(N)_ to _O(M)_. Sometimes _M == N_, so this doesn't help, but sometimes _M << N_ (and sometimes `N` is small, in which case _O(N)_ is fine.)

Reordering clauses involves some risk; specifically, that the class hierarchy will change between compile and run time. It seems eminently safe to reorder `String` and `Integer`, but more questionable to reorder an arbitrary class `Foo` with `Runnable`, even if `Foo` doesn't implement `Runnable` now, because it might easily be changed to do so later. Ideally we'd like to perform class-hierarchy optimizations using the runtime hierarchy, not the compile-time hierarchy.

## Type classifiers

The technique outlined in _Part 1_, where we lower the complex switch to a dense `int` switch, and use an indy-based classifier to select an index, is applicable here as well. First let's consider a switch consisting only of unguarded type-test patterns, optionally with a default clause.

We'll start with an `indy` bootstrap whose static arguments are `Class` constants corresponding to each arm of the switch, whose dynamic argument is the switch target, and whose return value is a case number (or distinguished sentinels for "no match" and `null`.) We can easily implement such a bootstrap with a linear search, but can also do better; if some subset of the classes are `final`, we can choose between these more quickly (such as via binary search on `hashCode()`, hash function, or hash table), and we need perform only a single operation to test all of those at once. Dynamic techniques (such as building a hash map of previously seen target types), which `indy` is well-suited to, can asymptotically approach _O(1)_ even when the classes involved are not final.

So we can lower:

    switch (x) {
        case T t: A
        case U u: B
        case V v: C
    }

to

    int y = indy[bootstrap=typeSwitch(T.class, U.class, V.class)](x)
    switch (y) {
        case 0: A
        case 1: B
        case 2: C
    }

This has the advantages that the generated code is very similar to the source code, we can (in some cases) get _O(1)_ dispatch performance, and we can handle fallthrough with no additional complexity.

#### Guards

There are two approaches we could take to add support for guards into the process; we could try to teach the bootstrap about guards (and would have to pass locals that appear in guard expressions as additional arguments to the classifier), or we could leave guards to the generated bytecode. The latter seems far more attractive, but requires some tweaks to the bootstrap arguments and to the shape of the generated code.

If the classifier says "you have matched case #3", but then we fail the guard for #3, we want to go back into the classifier and start again at #4. (Sometimes the classifier can also use this information ("start over at #4") to optimize away unnecessary tests.)

We add a second argument (where to start) to the classifier invocation signature, and wrap the switch in a loop, lowering:

    switch (target) {
        case T t where (e1): A
        case T t where (e2): B
        case U u where (e3): C
    }

into

    int index = -1; // start at the top
    while (true) {
        index = indy[...](target, index)
        switch (index) {
            case 0: if (!e1) continue; A
            case 1: if (!e2) continue; B
            case 2: if (!e3) continue; C
            default: break;
        }
        break;
    }

For cases where the same type test is repeated in consecutive positions (at N and N+1), we can have the static compiler coalesce them as above, or we could have the bootstrap maintain a table so that if you re-enter the bootstrap where the previous answer was N, then it can immediately return N+1. Similarly, if N and N+1 are known to be mutually exclusive types (like `String` and `Integer`), on reentering the classifier with N, we can skip right to N+2 since if we matched `String`, we cannot match `Integer`. Lookup tables for such optimizations can be built at callsite linkage time.

#### Mixing constants and type tests

This approach also extends to tests that are a mix of constant patterns and type-test patterns, such as:

    switch (x) {
        case "Foo": ...
        case 0L: ...
        case Integer i:
    }

We can extend the bootstrap protocol to accept constants as well as types, and it is a straightforward optimization to combine both type matching and constant matching in a single pass.

## Nested patterns

Nested patterns are essentially guards; even if we don't expose guards in the language, we can desugar

    case Point(0, var x):

into the equivalent of

    case Point(var a, var x) && a matches 0:

using the same translation story as above -- use the classifier to select a candidate case arm based on the top-type of the pattern, and then do additional checks in the generated bytecode, and if the checks fail, continue and re-enter the classifier starting at the next case.

#### Explicit continue

An alternative to exposing guards is to expose an explicit `continue` statement in switch, which would have the effect of "keep matching at the next case." Then guards could be expressed imperatively as:

    case P:
        if (!guard)
            continue;
        ...
        break;
    case Q: ...

From guy.steele at oracle.com Fri Apr 6 16:45:52 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 6 Apr 2018 12:45:52 -0400
Subject: Switch translation
In-Reply-To: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com>
References: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com>
Message-ID: <6FDE8CBD-14BA-4A18-9030-877E9C664194@oracle.com>

Very comprehensive. Four groups of comments below (one at the very bottom).

> On Apr 6, 2018, at 11:51 AM, Brian Goetz wrote:
>
> The following outlines our story for translating improved switches, including both the switch improvements coming as part of JEP 325, and follow-on work to add pattern matching to switches. Much of this has been discussed already over the last year, but here it is in one place.
>
> # Switch Translation
> #### Maurizio Cimadamore and Brian Goetz
> #### April 2018
>
> ## Part 1 -- constant switches
>
> This part examines the current translation of `switch` constructs by `javac`, and proposes a more general translation for switching on primitives, boxes, strings, and enums, with the goals of:
>
> - Unify the treatment of `switch` variants, simplifying the compiler implementation and reducing the static footprint of generated code;
> - Move responsibility for target classification from compile time to run time, allowing us to more freely update the logic without updating the compiler.
>
> ## Current translation
>
> Switches on `int` (and the smaller integer primitives) are translated in one of two ways. If the labels are relatively dense, we translate an `int` switch to a `tableswitch`; if they are sparse, we translate to a `lookupswitch`. The current heuristic appears to be that we use a `tableswitch` if it results in a smaller bytecode than a `lookupswitch` (which uses twice as many bytes per entry), which is a reasonable heuristic.
>
> #### Switches on boxes
>
> Switches on primitive boxes are currently implemented as if they were primitive switches, unconditionally unboxing the target before entry (possibly throwing NPE).
>
> #### Switches on strings
>
> Switches on strings are implemented as a two-step process, exploiting the fact that strings cache their `hashCode()` and that hash codes are reasonably spread out. Given a switch on strings like the one below:
>
>     switch (s) {
>         case "Hello": ...
>         case "World": ...
>         default: ...
>     }
>
> The compiler desugars this into two separate switches, where the first switch maps the input strings into a range of numbers [0..1], as shown below, which can then be used in a subsequent plain switch on ints. The generated code unconditionally calls `hashCode()`, again possibly throwing NPE.
>
>     int index=-1;
>     switch (s.hashCode()) {
>         case 12345: if (!s.equals("Hello")) break; index = 1; break;
>         case 6789: if (!s.equals("World")) break; index = 0; break;
>         default: index = -1;
>     }
>     switch (index) {
>         case 0: ...
>         case 1: ...
>         default: ...
>     }
>
> If there are hash collisions between the strings, the first switch must try all possible matching strings.

Minor point: unclear why the default case has to assign -1 to index, when it is already initialized to -1.

I see why you use this structure, because it fits a general paradigm of first mapping to an integer. However, a post-optimization might be able to turn such a structure, where the assignments to "index" are all constants rather than the results of calling some opaque classifier method, into a control structure with a single switch statement and no use of the intermediate integer encoding. I'll show a more general example to give you the idea:

  switch (s) {
    case "Hello": stmts1;
    case "World": stmts2;
    case "Goodbye": stmts3;
    default: stmtsD;
  }

and suppose that "Hello" and "Goodbye" happen to have the same hashcode. It might be transformed into:

  int index=-1;
  switch (s.hashCode()) {
    case 12345: if (s.equals("Hello")) index = 0; else if (s.equals("Goodbye")) index = 2; break;
    case 6789: if (s.equals("World")) index = 1; break;
    default: break;
  }
  switch (index) {
    case 0: stmts1;
    case 1: stmts2;
    case 2: stmts3;
    default: stmtsD;
  }

I now suggest that a post-optimization might then turn this into:

  SUCCESS: {
    DEFAULT: {
      switch (s.hashCode()) {
        case 12345: if (s.equals("Hello")) { stmts1; break SUCCESS; } else if (s.equals("Goodbye")) { stmts3; break SUCCESS; } else break DEFAULT;
        case 6789: if (s.equals("World")) { stmts2; break SUCCESS; } else break DEFAULT;
        default: break DEFAULT;
      }
      break SUCCESS;
    }
    do { stmtsD; } while (false);
  }

where `SUCCESS` and `DEFAULT` are suitably generated fresh statement labels. (You might think that simply

  SUCCESS: {
    switch (s.hashCode()) {
      case 12345: if (s.equals("Hello")) { stmts1; break SUCCESS; } else if (s.equals("Goodbye")) { stmts3; break SUCCESS; } else break;
      case 6789: if (s.equals("World")) { stmts2; break SUCCESS; } else break;
      default: break;
    }
    stmtsD;
  }

would do the job, but the first version, using a `DEFAULT` label and a dummy `do` statement, allows for stmts1, stmts2, stmts3, and stmtsD to contain `break` statements. Of course, that's assuming use of only surface syntax to express the transformations; the compiler can probably be smarter than that in practice.)

> #### Switches on enums
>
> Switches on `enum` constants exploit the fact that enums have (usually dense) integral ordinal values. Unfortunately, because an ordinal value can change between compilation time and runtime, we cannot rely on this mapping directly, but instead need to do an extra layer of mapping. Given a switch like:
>
>     switch(color) {
>         case RED: ...
>         case GREEN: ...
>     }
>
> The compiler numbers the cases starting at 1 (as with string switch), and creates a synthetic class that maps the runtime values of the enum ordinals to the statically numbered cases:

Inconsistency: in the string example above, you actually numbered the cases 0 and 1, not 1 and 2.

>
>     class Outer$0 {
>         synthetic final int[] $EnumMap$Color = new int[Color.values().length];
>         static {
>             try { $EnumMap$Color[RED.ordinal()] = 1; } catch (NoSuchFieldError ex) {}
>             try { $EnumMap$Color[GREEN.ordinal()] = 2; } catch (NoSuchFieldError ex) {}
>         }
>     }
>
> Then, the switch is translated as follows:
>
>     switch(Outer$0.$EnumMap$Color[color.ordinal()]) {
>         case 1: stmt1;
>         case 2: stmt2
>     }

Presumably for this example the chosen integers start with 1 rather than 0, so that if any element of the array is not explicitly initialized by Outer$0, its default 0 value will not be confused with an actual enum value. This subtle point should be mentioned explicitly.

An interesting question is whether a "case 0: throw new Exception(...);" should be supplied, on the grounds that it's okay for the programmer to ignore enum values presumably known at compile time, but not to ignore values that sneaked in later? (If the desire is really to ignore all such values, known and unknown, the programmer can always write "default: break;".)

> In other words, we construct an array whose size is the cardinality of the enum, and then the element at position *i* of such array will contain the case index corresponding to the enum constant whose ordinal is *i*.
>
> ## A more general scheme
>
> The handling of strings and enums gives us a hint of how to create a more regular scheme; for `switch` targets more complex than `int`, we lower the `switch` to an `int` switch with consecutive `case` labels, and use a separate process to map the target into the range of synthetic case labels.
>
> Now that we have `invokedynamic` in our toolbox, we can reduce all of the non-`int` cases to a single form, where we number the cases with consecutive integers, and perform case selection via an `invokedynamic`-based classifier function, whose static argument list receives a description of the actual targets, and which returns an `int` identifying what `case` to select.
>
> This approach has several advantages:
> - Reduced compiler complexity -- all switches follow a common pattern;
> - Reduced static code size;
> - The classification function can select from a wide range of strategies (linear search, binary search, building a `HashMap`, constructing a perfect hash function, etc), which can vary over time or from situation to situation;
> - We are free to improve the strategy or select an alternate strategy (say, to optimize for startup time) without having to recompile the code;
> - Hopefully at least, if not more, JIT-friendly than the existing translation.
>
> We can also use this approach in preference to `lookupswitch` for non-dense `int` switches, as well as use it to extend `switch` to handle `long`, `float`, and `double` targets (which were surely excluded in part because the JVM didn't provide a convenient translation target for these types.)
>
> #### Bootstrap design
>
> When designing the `invokedynamic` bootstraps to support this translation, we face the classic lumping-vs-splitting decision. For now, we'll bias towards splitting. In the following example, `BOOTSTRAP_PREAMBLE` indicates the usual leading arguments for an indy bootstrap. We assume the compiler has numbered the case values densely from 0..N, and the bootstrap will return [0,N) for success, or N for "no match".
>
> A strawman design might be:
>
>     // Numeric switches for P, accepts invocation as P -> I or Box(P) -> I
>     CallSite intSwitch(BOOTSTRAP_PREAMBLE, int... caseValues)
>
>     // Switch for String, invocation descriptor is String -> I
>     CallSite stringSwitch(BOOTSTRAP_PREAMBLE, String... caseValues)
>
>     // Switch for Enum, invocation descriptor is E -> I
>     CallSite enumSwitch(BOOTSTRAP_PREAMBLE, Class>> clazz,
>                         String... caseNames)
>
> It might be possible to encode all of these into a single bootstrap, but given that the compiler already treats each type slightly differently, it seems there is little value in this sort of lumping for non-pattern switches.
>
> The `enumSwitch` bootstrap as proposed uses `String` values to describe the enum constants, rather than encoding the enum constants directly via condy. This allows us to be more robust to enums disappearing after compilation.
>
> This strategy is also dependent on having broken the limitation on 253 bootstrap arguments in indy/condy.
>
> #### Extending to other primitive types
>
> This approach extends naturally to other primitive types (long, double, float), by the addition of some more bootstraps (which need to deal with the additional complexities of infinity, NaN, etc):
>
>     CallSite longSwitch(BOOTSTRAP_PREAMBLE, long... caseValues)
>     CallSite floatSwitch(BOOTSTRAP_PREAMBLE, float... caseValues)
>     CallSite doubleSwitch(BOOTSTRAP_PREAMBLE, double... caseValues)
>
> #### Extending to null
>
> The scheme as proposed above does not explicitly handle nulls, which is a feature we'd like to have in `switch`. There are a few ways we could add null handling into the API:
>
> - Split entry points into null-friendly or null-hostile switches;
> - Find a way to encode nulls in the array of case values (which can be done with condy);
> - Always treat null as a possible input and a distinguished output, and have the compiler ensure the switch can handle this distinguished output.
>
> The last strategy is appealing and straightforward; assign a sentinel value (-1) to `null`, and always return this sentinel when the input is null. The compiler ensures that some case handles `null`, and if no case handles `null` then it inserts an implicit
>
>     case -1: throw new NullPointerException();
>
> into the generated code.
>
> #### General example
>
> If we have a string switch:
>
>     switch (x) {
>         case "Foo": m(); break;
>         case "Bar": n(); // fall through
>         case "Baz": r(); break;
>         default: p();
>     }
>
> we translate into:
>
>     int t = indy[bsm=stringSwitch["Foo", "Bar", "Baz"]](x)
>     switch (t) {
>         case -1: throw new NullPointerException(); // implicit null case
>         case 0: m(); break;
>         case 1: n(); // fall through
>         case 2: r(); break;
>         case 3: p(); // default case
>     }
>
> All switches, with the exception of `int` switches (and maybe not even non-dense `int` switches), follow this exact pattern. If the target type is not a reference type, the `null` case is not needed.
>
> This strategy is implemented in the `switch` branch of the amber repository; see `java.lang.runtime.SwitchBootstraps` in that branch for (rough!) implementations of the bootstraps.
>
> ## Patterns in narrow-target switches
>
> When we add patterns, we may encounter switches whose targets are tightly typed (e.g., `String` or `int`) but still use some patterns in their expression. For switches whose target type is a primitive, primitive box, `String`, or `enum`, we'd like to use the optimized translation strategy outlined here, but the following kinds of patterns might still show up in a switch on, say, `Integer`:
>
>     case var x:
>     case _:
>     case Integer x:
>     case Integer(var x):
>
> The first three can be translated away by the source compiler, as they are semantically equivalent to `default`. If any nontrivial patterns are present (including deconstruction patterns), we may need to translate as a pattern switch scheme -- see Part 2. (While the language may not distinguish between "legacy" and "pattern" switches -- in that all switches are pattern switches -- we'd like to avoid giving up obvious optimizations if we can.)
>
> # Part 2 -- type test patterns and guards
>
> A key motivation for reexamining switch translation is the impending arrival of patterns in switch. We expect switch translation for the pattern case to follow a similar structure -- lower to an `int` switch and use an indy-based classifier to select an index. However, there are a few additional complexities. One is that pattern cases may have guards, which means we need to be able to re-enter the bootstrap with an indication to "continue matching from case N", in the event of a failed guard. (Even if the language doesn't support guards directly, the obvious implementation strategy for nested patterns is to desugar them into guards.)
>
> Translating pattern switches is more complicated because there are more options for how to divide the work between the statically generated code and the switch classifier, and different choices have different performance side-effects (are binding variables "boxed" into a tuple to be returned, or do they need to be redundantly calculated).
>
> ## Type-test patterns
>
> Type-test patterns are notable because their applicability predicate is purely based on the type system, meaning that the compiler can directly reason about it both statically (using flow analysis, optimizing away dynamic type tests) and dynamically (with `instanceof`.) A switch involving type-tests:
>
>     switch (x) {
>         case String s: ...
>         case Integer i: ...
>         case Long l: ...
>     }
>
> can (among other strategies) be translated into a chain of `if-else` using `instanceof` and casts:
>
>     if (x instanceof String) { String s = (String) x; ... }
>     else if (x instanceof Integer) { Integer i = (Integer) x; ... }
>     else if (x instanceof Long) { Long l = (Long) x; ... }
>
> #### Guards
>
> The `if-else` desugaring can also naturally handle guards:
>
>     switch (x) {
>         case String s
>             where (s.length() > 0): ...
>         case Integer i
>             where (i > 0): ...
>         case Long l
>             where (l > 0L): ...
>     }
>
> can be translated to:
>
>     if (x instanceof String
>         && ((String) x).length() > 0) { String s = (String) x; ... }
>     else if (x instanceof Integer
>              && ((Integer) x) > 0) { Integer i = (Integer) x; ... }
>     else if (x instanceof Long
>              && ((Long) x) > 0L) { Long l = (Long) x; ... }
>
> #### Performance concerns
>
> The translation to `if-else` chains is simple (for switches without fallthrough), but is harder for the VM to optimize, because we've used a more general control flow mechanism. If the target is an empty `String`, which means we'd pass the first `instanceof` but fail the guard, class-hierarchy analysis could tell us that it can't possibly be an `Integer` or a `Long`, and so there's no need to perform those tests. But generating code that takes advantage of this information is more complex.
>
> In the extreme case, where a switch consists entirely of type test patterns for final classes, this could be performed as an O(1) operation by hashing. And this is a common case involving switches over alternatives in a sum (sealed) type. (We shouldn't rely on finality at compile time, as this can change between compile and run time, but we should take advantage of this at run time if we can.)
>
> Finally, the straightforward static translation may miss opportunities for optimization. For example:
>
>     switch (x) {
>         case Point p
>             where p.x > 0 && p.y > 0: A
>         case Point p
>             where p.x > 0 && p.y == 0: B
>     }
>
> Here, not only would we potentially test the target twice to see if it is a `Point`, but we then further extract the `x` component twice and perform the `p.x > 0` test twice.
>
> #### Optimization opportunities
>
> The compiler can eliminate some redundant calculations through straightforward techniques. The previous switch can be transformed to:
>
>     switch (x) {
>         case Point p:
>             if (((Point) p).x > 0 && ((Point) p).y > 0) { A }
>             else if (((Point) p).x > 0 && ((Point) p).y == 0) { B }
>     }
>
> to eliminate the redundant `instanceof` (and admit further CSE optimizations).
>
> #### Clause reordering
>
> The above example was easy to transform because the two `case Point` clauses were adjacent. But what if they are not? In some cases, it is safe to reorder them. For types `T` and `U`, it is safe to reorder `case T` and `case U` if the two types have no intersection; that is, there can be no type that is a subtype of both. This is true when `T` and `U` are classes and neither extends the other, or when one is a final class and the other is an interface that the class does not implement.
>
> The compiler could then reorder case clauses so that all the ones whose first test is `case Point` are adjacent, and then coalesce them all into a single arm of the `if-else` chain.
>
> A possible spoiler here is fallthrough; if case A falls into case B, then cases A and B have to be moved as a group. (This is another reason to consider limiting fallthrough.)
>
> A bigger possible spoiler here is separate compilation. If at compile time, we see that `T` and `U` are disjoint types, do we want to bake that assumption into the compilation, or do we have to re-check that assumption at runtime?
>
> #### Summary of if-else translation
>
> While the if-else translation at first looks pretty bad, we are able to extract a fair amount of redundancy through well-understood compiler transformations. If an N-way switch has only M distinct types in it, in most cases we can reduce the cost from _O(N)_ to _O(M)_. Sometimes _M == N_, so this doesn't help, but sometimes _M << N_ (and sometimes `N` is small, in which case _O(N)_ is fine.)
>
> Reordering clauses involves some risk; specifically, that the class hierarchy will change between compile and run time. It seems eminently safe to reorder `String` and `Integer`, but more questionable to reorder an arbitrary class `Foo` with `Runnable`, even if `Foo` doesn't implement `Runnable` now, because it might easily be changed to do so later. Ideally we'd like to perform class-hierarchy optimizations using the runtime hierarchy, not the compile-time hierarchy.
>
> ## Type classifiers
>
> The technique outlined in _Part 1_, where we lower the complex switch to a dense `int` switch, and use an indy-based classifier to select an index, is applicable here as well. First let's consider a switch consisting only of unguarded type-test patterns, optionally with a default clause.
>
> We'll start with an `indy` bootstrap whose static arguments are `Class` constants corresponding to each arm of the switch, whose dynamic argument is the switch target, and whose return value is a case number (or distinguished sentinels for "no match" and `null`.) We can easily implement such a bootstrap with a linear search, but can also do better; if some subset of the classes are `final`, we can choose between these more quickly (such as via binary search on `hashCode()`, hash function, or hash table), and we need perform only a single operation to test all of those at once. Dynamic techniques (such as building a hash map of previously seen target types), which `indy` is well-suited to, can asymptotically approach _O(1)_ even when the classes involved are not final.
>
> So we can lower:
>
>     switch (x) {
>         case T t: A
>         case U u: B
>         case V v: C
>     }
>
> to
>
>     int y = indy[bootstrap=typeSwitch(T.class, U.class, V.class)](x)
>     switch (y) {
>         case 0: A
>         case 1: B
>         case 2: C
>     }
>
> This has the advantages that the generated code is very similar to the source code, we can (in some cases) get _O(1)_ dispatch performance, and we can handle fallthrough with no additional complexity.
>
> #### Guards
>
> There are two approaches we could take to add support for guards into the process; we could try to teach the bootstrap about guards (and would have to pass locals that appear in guard expressions as additional arguments to the classifier), or we could leave guards to the generated bytecode. The latter seems far more attractive, but requires some tweaks to the bootstrap arguments and to the shape of the generated code.
>
> If the classifier says "you have matched case #3", but then we fail the guard for #3, we want to go back into the classifier and start again at #4. (Sometimes the classifier can also use this information ("start over at #4") to optimize away unnecessary tests.)
>
> We add a second argument (where to start) to the classifier invocation signature, and wrap the switch in a loop, lowering:
>
>     switch (target) {
>         case T t where (e1): A
>         case T t where (e2): B
>         case U u where (e3): C
>     }
>
> into
>
>     int index = -1; // start at the top
>     while (true) {
>         index = indy[...](target, index)
>         switch (index) {
>             case 0: if (!e1) continue; A
>             case 1: if (!e2) continue; B
>             case 2: if (!e3) continue; C
>             default: break;
>         }
>         break;
>     }
>
> For cases where the same type test is repeated in consecutive positions (at N and N+1), we can have the static compiler coalesce them as above, or we could have the bootstrap maintain a table so that if you re-enter the bootstrap where the previous answer was N, then it can immediately return N+1. Similarly, if N and N+1 are known to be mutually exclusive types (like `String` and `Integer`), on reentering the classifier with N, we can skip right to N+2 since if we matched `String`, we cannot match `Integer`. Lookup tables for such optimizations can be built at callsite linkage time.
>
> #### Mixing constants and type tests
>
> This approach also extends to tests that are a mix of constant patterns and type-test patterns, such as:
>
>     switch (x) {
>         case "Foo": ...
>         case 0L: ...
>         case Integer i:
>     }
>
> We can extend the bootstrap protocol to accept constants as well as types, and it is a straightforward optimization to combine both type matching and constant matching in a single pass.
>
> ## Nested patterns
>
> Nested patterns are essentially guards; even if we don't expose guards in the language, we can desugar
>
>     case Point(0, var x):
>
> into the equivalent of
>
>     case Point(var a, var x) && a matches 0:
>
> using the same translation story as above -- use the classifier to select a candidate case arm based on the top-type of the pattern, and then do additional checks in the generated bytecode, and if the checks fail, continue and re-enter the classifier starting at the next case.
>
> #### Explicit continue
>
> An alternative to exposing guards is to expose an explicit `continue` statement in switch, which would have the effect of "keep matching at the next case." Then guards could be expressed imperatively as:
>
>     case P:
>         if (!guard)
>             continue;
>         ...
>         break;
>     case Q: ...
>

A nice idea, but careful: it is already meaningful to write:

  while (...) {
    switch (...) {
      case 1: ...
      case 2: if (foo) continue; ...
    }
  }

and expect the `continue` to start a new iteration of the `while` loop. Indeed, this fact was already exploited above under "### Guards".

If you really want to introduce the idea of "continuing a switch dispatch" into the surface syntax, even if only for expository purposes, let me suggest the form `continue switch;`.

  switch (...) {
    case P:
      if (!guard)
        continue switch;
      ...
      break;
    case Q: ...
  }

--Guy
From brian.goetz at oracle.com Fri Apr 6 17:58:24 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 6 Apr 2018 13:58:24 -0400
Subject: Switch translation
In-Reply-To: <6FDE8CBD-14BA-4A18-9030-877E9C664194@oracle.com>
References: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com> <6FDE8CBD-14BA-4A18-9030-877E9C664194@oracle.com>
Message-ID: <2b413b07-1c48-216c-a010-552d6b75f52f@oracle.com>

>>
>>     int index=-1;
>>     switch (s.hashCode()) {
>>         case 12345: if (!s.equals("Hello")) break; index = 1; break;
>>         case 6789: if (!s.equals("World")) break; index = 0; break;
>>         default: index = -1;
>>     }
>>     switch (index) {
>>         case 0: ...
>>         case 1: ...
>>         default: ...
>>     }
>>
>> If there are hash collisions between the strings, the first switch
>> must try all possible matching strings.
>
> I see why you use this structure, because it fits a general paradigm
> of first mapping to an integer.

Or, "used", since this is the historical strategy which we're tossing over for the indy-based one.

> I now suggest that a post-optimization might then turn this into:
>
>   SUCCESS: {
>     DEFAULT: {
>       switch (s.hashCode()) {
>         case 12345: if (s.equals("Hello")) { stmts1; break SUCCESS; }
> else if (s.equals("Goodbye")) { stmts3; break SUCCESS; } else break
> DEFAULT;

Yes; the thing that pushed us to this translation was fallthrough and other weird control flow; by lowering the string-switch to an int-switch, the control structure is preserved, so any complex control flow comes along "for free" by existing int-switch translation. Of course, it's not free; we pay with a pre-switch. (When we added strings in switch, it was part of "Project Coin", whose mandate was "small features", so it was preferable at the time to choose a simpler but less efficient desugaring.)
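To make the "pre-switch" trade-off concrete, here is a minimal hand-written sketch of the historical two-step lowering, using the real hash codes of "Hello" and "World". The class and method names (`StringSwitchLowering`, `caseIndex`, `run`) are illustrative assumptions and are not what `javac` emits; the point is only that the second switch is an ordinary `int` switch, so fallthrough comes along unchanged.

```java
class StringSwitchLowering {
    // Step 1 (the "pre-switch"): map the string to a dense case index.
    static int caseIndex(String s) {
        int index = -1;
        switch (s.hashCode()) {                 // throws NPE if s is null
            case 69609650:                      // "Hello".hashCode()
                if (s.equals("Hello")) index = 0;
                break;
            case 83766130:                      // "World".hashCode()
                if (s.equals("World")) index = 1;
                break;
        }
        return index;
    }

    // Step 2: a plain int switch, so fallthrough needs no special handling.
    static String run(String s) {
        StringBuilder out = new StringBuilder();
        switch (caseIndex(s)) {
            case 0: out.append("hello;");       // falls through into case 1
            case 1: out.append("world;"); break;
            default: out.append("default;");
        }
        return out.toString();
    }
}
```

The cost mentioned above shows up as the extra `hashCode()`/`equals()` switch that runs before the real one.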
>
>
>> #### Switches on enums
>>
>> Switches on `enum` constants exploit the fact that enums have
>> (usually dense) integral ordinal values. Unfortunately, because an
>> ordinal value can change between compilation time and runtime, we
>> cannot rely on this mapping directly, but instead need to do an extra
>> layer of mapping. Given a switch like:
>>
>>     switch(color) {
>>         case RED: ...
>>         case GREEN: ...
>>     }
>>
>> The compiler numbers the cases starting at 1 (as with string switch),
>> and creates a synthetic class that maps the runtime values of the
>> enum ordinals to the statically numbered cases:
>
> Inconsistency: in the string example above, you actually numbered the
> cases 0 and 1, not 1 and 2.

The old way, where the compiler generated the transform table (Java 5 and later) used 1-origin, for the reason you surmise. The new, indy-based translation uses 0, like the String example.

>
> Presumably for this example the chosen integers start with 1 rather
> than 0, so that if any element of the array is not explicitly
> initialized by Outer$0, its default 0 value will not be confused with
> an actual enum value. This subtle point should be mentioned explicitly.

Yes, that's exactly why the historical approach did it this way. The new way (which is uniform with other indy-based switch types) takes care of this by pre-filling the array with the index that indicates "default" at linkage time. From SwitchBootstraps::enumSwitch:

            ordinalMap = new int[enumClass.getEnumConstants().length];
            Arrays.fill(ordinalMap, enumNames.length);
            for (int i=0; i

>>
>> #### Explicit continue
>>
>> An alternative to exposing guards is to expose an explicit `continue`
>> statement in switch, which would have the effect of "keep matching at
>> the next case." Then guards could be expressed imperatively as:
>>
>>     case P:
>>         if (!guard)
>>             continue;
>>         ...
>>         break;
>>     case Q: ...
>>
> A nice idea, but careful: it is already meaningful to write:
>
> while (...) { switch (...) { case 1: ... case 2: if (foo) continue; ... } }
>
> and expect the `continue` to start a new iteration of the `while`
> loop. Indeed, this fact was already exploited above under "### Guards".

Yes. One of the downsides of exposing `continue` is that currently the (switch, continue) entry in my table from "Disallowing break label and continue label inside expression switch" has a P instead of an X, meaning that continue is currently allowed in a switch if there's an enclosing continue-able context. So this could be disambiguated as you say with "continue switch", or with requiring a label in some or all circumstances.

From guy.steele at oracle.com Fri Apr 6 18:38:25 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 6 Apr 2018 14:38:25 -0400
Subject: Switch translation
In-Reply-To: <2b413b07-1c48-216c-a010-552d6b75f52f@oracle.com>
References: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com> <6FDE8CBD-14BA-4A18-9030-877E9C664194@oracle.com> <2b413b07-1c48-216c-a010-552d6b75f52f@oracle.com>
Message-ID:

> On Apr 6, 2018, at 1:58 PM, Brian Goetz wrote:
>
>
>>>
>>>     int index=-1;
>>>     switch (s.hashCode()) {
>>>         case 12345: if (!s.equals("Hello")) break; index = 1; break;
>>>         case 6789: if (!s.equals("World")) break; index = 0; break;
>>>         default: index = -1;
>>>     }
>>>     switch (index) {
>>>         case 0: ...
>>>         case 1: ...
>>>         default: ...
>>>     }
>>>
>>> If there are hash collisions between the strings, the first switch must try all possible matching strings.
>>
>> I see why you use this structure, because it fits a general paradigm of first mapping to an integer.
>
> Or, "used", since this is the historical strategy which we're tossing over for the indy-based one.

Sorry, I incorrectly interpreted some of the transitional text.
>
>> I now suggest that a post-optimization might then turn this into:
>>
>>   SUCCESS: {
>>     DEFAULT: {
>>       switch (s.hashCode()) {
>>         case 12345: if (s.equals("Hello")) { stmts1; break SUCCESS; } else if (s.equals("Goodbye")) { stmts3; break SUCCESS; } else break DEFAULT;
>
> Yes; the thing that pushed us to this translation was fallthrough and other weird control flow; by lowering the string-switch to an int-switch, the control structure is preserved, so any complex control flow comes along "for free" by existing int-switch translation. Of course, it's not free; we pay with a pre-switch. (When we added strings in switch, it was part of "Project Coin", whose mandate was "small features", so it was preferable at the time to choose a simpler but less efficient desugaring.)

Oops, I forgot about preserving fallthrough. Yuck. "Never mind."

Well, the post-optimization can still be used in situations where no fallthrough occurs. Can decide later whether it is worth the trouble to avoid the integer encoding.

From forax at univ-mlv.fr Fri Apr 6 21:22:53 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 6 Apr 2018 23:22:53 +0200 (CEST)
Subject: Switch translation
In-Reply-To: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com>
References: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com>
Message-ID: <1592418474.1146193.1523049772989.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Friday, April 6, 2018 17:51:49
> Subject: Switch translation

> The following outlines our story for translating improved switches, including
> both the switch improvements coming as part of JEP 325, and follow-on work to
> add pattern matching to switches. Much of this has been discussed already over
> the last year, but here it is in one place.
> # Switch Translation
> #### Maurizio Cimadamore and Brian Goetz
> #### April 2018
>
> ## Part 1 -- constant switches
>
> This part examines the current translation of `switch` constructs by `javac`,
> and proposes a more general translation for switching on primitives, boxes,
> strings, and enums, with the goals of:
> - Unifying the treatment of `switch` variants, simplifying the compiler
> implementation and reducing the static footprint of generated code;
> - Moving responsibility for target classification from compile time to run time,
> allowing us to more freely update the logic without updating the compiler.
>
> ## Current translation
>
> Switches on `int` (and the smaller integer primitives) are translated in one of
> two ways. If the labels are relatively dense, we translate an `int` switch to a
> `tableswitch`; if they are sparse, we translate to a `lookupswitch`. The
> current heuristic appears to be that we use a `tableswitch` if it results in a
> smaller bytecode than a `lookupswitch` (which uses twice as many bytes per
> entry), which is a reasonable heuristic.
>
> #### Switches on boxes
>
> Switches on primitive boxes are currently implemented as if they were primitive
> switches, unconditionally unboxing the target before entry (possibly throwing
> NPE).
>
> #### Switches on strings
>
> Switches on strings are implemented as a two-step process, exploiting the fact
> that strings cache their `hashCode()` and that hash codes are reasonably spread
> out. Given a switch on strings like the one below:
>
>     switch (s) {
>         case "Hello": ...
>         case "World": ...
>         default: ...
>     }
>
> The compiler desugars this into two separate switches, where the first switch
> maps the input strings into a range of numbers [0..1], as shown below, which
> can then be used in a subsequent plain switch on ints. The generated code
> unconditionally calls `hashCode()`, again possibly throwing NPE.
>     int index=-1;
>     switch (s.hashCode()) {
>         case 12345: if (!s.equals("Hello")) break; index = 1; break;
>         case 6789: if (!s.equals("World")) break; index = 0; break;
>         default: index = -1;
>     }
>     switch (index) {
>         case 0: ...
>         case 1: ...
>         default: ...
>     }
>
> If there are hash collisions between the strings, the first switch must try all
> possible matching strings.
>
> #### Switches on enums
>
> Switches on `enum` constants exploit the fact that enums have (usually dense)
> integral ordinal values. Unfortunately, because an ordinal value can change
> between compilation time and runtime, we cannot rely on this mapping directly,
> but instead need to do an extra layer of mapping. Given a switch like:
>
>     switch(color) {
>         case RED: ...
>         case GREEN: ...
>     }
>
> The compiler numbers the cases starting at 1 (as with string switch), and creates
> a synthetic class that maps the runtime values of the enum ordinals to the
> statically numbered cases:
>
>     class Outer$0 {
>         synthetic final int[] $EnumMap$Color = new int[Color.values().length];
>         static {
>             try { $EnumMap$Color[RED.ordinal()] = 1; } catch (NoSuchFieldError ex) {}
>             try { $EnumMap$Color[GREEN.ordinal()] = 2; } catch (NoSuchFieldError ex) {}
>         }
>     }
>
> Then, the switch is translated as follows:
>
>     switch(Outer$0.$EnumMap$Color[color.ordinal()]) {
>         case 1: stmt1;
>         case 2: stmt2
>     }
>
> In other words, we construct an array whose size is the cardinality of the enum,
> and then the element at position *i* of such array will contain the case
> index corresponding to the enum constant whose ordinal is *i*.
>
> ## A more general scheme
>
> The handling of strings and enums gives us a hint of how to create a more regular
> scheme; for `switch` targets more complex than `int`, we lower the `switch` to
> an `int` switch with consecutive `case` labels, and use a separate process to
> map the target into the range of synthetic case labels.
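As a rough illustration of the historical enum mapping described in the quoted text, the following self-contained sketch builds the ordinal-to-case table by hand. `EnumSwitchLowering`, `ENUM_MAP`, and `describe` are made-up names standing in for the synthetic `Outer$0` class; the 1-origin numbering is kept so that the array's default 0 means "no case".

```java
class EnumSwitchLowering {
    enum Color { RED, GREEN, BLUE }

    // Stand-in for the synthetic class: indexed by runtime ordinal, holding the
    // compile-time case number (1-origin, so the default 0 means "no case").
    static final int[] ENUM_MAP = new int[Color.values().length];
    static {
        // A constant removed after compilation would surface as NoSuchFieldError;
        // its slot then simply keeps the default 0.
        try { ENUM_MAP[Color.RED.ordinal()] = 1; } catch (NoSuchFieldError ex) { }
        try { ENUM_MAP[Color.GREEN.ordinal()] = 2; } catch (NoSuchFieldError ex) { }
    }

    static String describe(Color color) {
        switch (ENUM_MAP[color.ordinal()]) {
            case 1: return "red";
            case 2: return "green";
            default: return "other";   // BLUE has no case, so its slot is 0
        }
    }
}
```

Because the table is indexed by the runtime ordinal, reordering the constants of `Color` after compilation would not change which arm is selected.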
> Now that we have `invokedynamic` in our toolbox, we can reduce all of the
> non-`int` cases to a single form, where we number the cases with consecutive
> integers, and perform case selection via an `invokedynamic`-based classifier
> function, whose static argument list receives a description of the actual
> targets, and which returns an `int` identifying what `case` to select.
>
> This approach has several advantages:
> - Reduced compiler complexity -- all switches follow a common pattern;
> - Reduced static code size;
> - The classification function can select from a wide range of strategies (linear
> search, binary search, building a `HashMap`, constructing a perfect hash
> function, etc), which can vary over time or from situation to situation;
> - We are free to improve the strategy or select an alternate strategy (say, to
> optimize for startup time) without having to recompile the code;
> - Hopefully at least, if not more, JIT-friendly than the existing translation.
>
> We can also use this approach in preference to `lookupswitch` for non-dense
> `int` switches, as well as use it to extend `switch` to handle `long`, `float`,
> and `double` targets (which were surely excluded in part because the JVM didn't
> provide a convenient translation target for these types.)

It seems to be a good general approach, but it has several drawbacks:
- it does not work well with the type switch, because the instanceof part (at least the part that recognizes the type) will be inside the invokedynamic while the cast part will be in the tableswitch, so there is little chance that the VM can optimize such a construction to avoid doing the instanceof/checkcast twice.
- it is not optimal in terms of bytecode size for an expression switch that has no side effects on local variables, because there is a better representation where each case is desugared to a static method, as for a lambda. In that case, you do not need a tableswitch; an invokedynamic is enough.
In terms of performance, because the VM used not to gather profiles when executing tableswitch/lookupswitch, performance was not good compared to using only an invokedynamic, but JDK-8200303 may change things. So trying to detect whether an invokedynamic alone is enough can be interesting. In terms of the bootstrap method, most of the code can be shared, apart from returning an int versus calling a static method at the leaf.

> #### Bootstrap design
>
> When designing the `invokedynamic` bootstraps to support this translation, we
> face the classic lumping-vs-splitting decision. For now, we'll bias towards
> splitting. In the following example, `BOOTSTRAP_PREAMBLE` indicates the usual
> leading arguments for an indy bootstrap. We assume the compiler has numbered
> the case values densely from 0..N, and the bootstrap will return [0,N) for
> success, or N for "no match".
>
> A strawman design might be:
>
>     // Numeric switches for P, accepts invocation as P -> I or Box(P) -> I
>     CallSite intSwitch(BOOTSTRAP_PREAMBLE, int... caseValues)
>
>     // Switch for String, invocation descriptor is String -> I
>     CallSite stringSwitch(BOOTSTRAP_PREAMBLE, String... caseValues)
>
>     // Switch for Enum, invocation descriptor is E -> I
>     CallSite enumSwitch(BOOTSTRAP_PREAMBLE, Class>> clazz,
>                         String... caseNames)
>
> It might be possible to encode all of these into a single bootstrap, but given
> that the compiler already treats each type slightly differently, it seems there
> is little value in this sort of lumping for non-pattern switches.
>
> The `enumSwitch` bootstrap as proposed uses `String` values to describe the enum
> constants, rather than encoding the enum constants directly via condy. This
> allows us to be more robust to enums disappearing after compilation.
>
> This strategy is also dependent on having broken the limitation on 253 bootstrap
> arguments in indy/condy.
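The name-based `enumSwitch` idea can be sketched without any `invokedynamic` plumbing. The code below is an assumption-laden stand-in, not the code in the amber repository: `EnumNameClassifier`, `buildOrdinalMap`, and `classify` are hypothetical names, and the table is built eagerly where a real bootstrap would build it at call-site linkage. It follows the convention above of returning [0,N) on a match and N for "no match".

```java
import java.util.Arrays;

class EnumNameClassifier {
    enum Color { RED, GREEN, BLUE }

    // Built once (think: at call-site linkage). Case names that no longer exist in
    // the enum are skipped, and every unmapped ordinal reports N, i.e. "no match".
    static <E extends Enum<E>> int[] buildOrdinalMap(Class<E> enumClass, String... caseNames) {
        int[] ordinalMap = new int[enumClass.getEnumConstants().length];
        Arrays.fill(ordinalMap, caseNames.length);
        for (int i = 0; i < caseNames.length; i++) {
            try {
                ordinalMap[Enum.valueOf(enumClass, caseNames[i]).ordinal()] = i;
            } catch (IllegalArgumentException missing) {
                // The constant disappeared after compilation; keep the default.
            }
        }
        return ordinalMap;
    }

    // The per-invocation work is a single array load.
    static int classify(int[] ordinalMap, Enum<?> target) {
        return ordinalMap[target.ordinal()];
    }
}
```

Describing the cases by name rather than by constant is what lets the stale name `"PURPLE"` in a call such as `buildOrdinalMap(Color.class, "GREEN", "PURPLE", "RED")` be tolerated instead of failing at link time.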
> #### Extending to other primitive types
>
> This approach extends naturally to other primitive types (long, double, float),
> by the addition of some more bootstraps (which need to deal with the additional
> complexities of infinity, NaN, etc):
>
>     CallSite longSwitch(BOOTSTRAP_PREAMBLE, long... caseValues)
>     CallSite floatSwitch(BOOTSTRAP_PREAMBLE, float... caseValues)
>     CallSite doubleSwitch(BOOTSTRAP_PREAMBLE, double... caseValues)
>
> #### Extending to null
>
> The scheme as proposed above does not explicitly handle nulls, which is a
> feature we'd like to have in `switch`. There are a few ways we could add null
> handling into the API:
> - Split entry points into null-friendly or null-hostile switches;
> - Find a way to encode nulls in the array of case values (which can be done with
> condy);
> - Always treat null as a possible input and a distinguished output, and have the
> compiler ensure the switch can handle this distinguished output.
>
> The last strategy is appealing and straightforward; assign a sentinel value (-1)
> to `null`, and always return this sentinel when the input is null. The compiler
> ensures that some case handles `null`, and if no case handles `null` then it
> inserts an implicit
>
>     case -1: throw new NullPointerException();
>
> into the generated code.

or
- use a boolean as the first bootstrap constant argument to indicate whether you want null to map to -1 or to throw an NPE. It will make the generated bytecode smaller and ensure that, most of the time, if there is no case null, the handling of null can be done implicitly by the VM.
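The two null-handling options discussed here (a -1 sentinel versus a boolean flag selecting NPE) can be compared side by side in a small sketch. This is plain Java rather than a real bootstrap; `NullAwareStringClassifier` and its `nullIsSentinel` parameter are hypothetical names, and a `HashMap` is just one of the strategies a classifier might pick.

```java
import java.util.HashMap;
import java.util.Map;

class NullAwareStringClassifier {
    private final Map<String, Integer> indexByCase = new HashMap<>();
    private final int noMatch;
    private final boolean nullIsSentinel;

    NullAwareStringClassifier(boolean nullIsSentinel, String... caseValues) {
        this.nullIsSentinel = nullIsSentinel;
        this.noMatch = caseValues.length;            // N means "no match" / default
        for (int i = 0; i < caseValues.length; i++) {
            indexByCase.putIfAbsent(caseValues[i], i);
        }
    }

    int classify(String target) {
        if (target == null) {
            if (nullIsSentinel) return -1;           // caller must supply a case -1
            throw new NullPointerException();        // flag says: no null case exists
        }
        return indexByCase.getOrDefault(target, noMatch);
    }
}
```

With the flag set to `false`, the generated switch would not need the implicit `case -1` arm at all, which is the bytecode saving suggested above.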
> #### General example
>
> If we have a string switch:
>
>     switch (x) {
>         case "Foo": m(); break;
>         case "Bar": n(); // fall through
>         case "Baz": r(); break;
>         default: p();
>     }
>
> we translate into:
>
>     int t = indy[bsm=stringSwitch["Foo", "Bar", "Baz"]](x)
>     switch (t) {
>         case -1: throw new NullPointerException(); // implicit null case
>         case 0: m(); break;
>         case 1: n(); // fall through
>         case 2: r(); break;
>         case 3: p(); // default case
>     }

with my proposed bootstrap recipe (use a boolean to indicate whether a nullcheck is needed or not),

    int t = indy[bsm=stringSwitch[false, "Foo", "Bar", "Baz"]](x)
    switch (t) {
        case 0: m(); break;
        case 1: n(); // fall through
        case 2: r(); break;
        case 3: p(); // default case
    }

> All switches, with the exception of `int` switches (and maybe not even non-dense
> `int` switches), follow this exact pattern. If the target type is not a
> reference type, the `null` case is not needed.
>
> This strategy is implemented in the `switch` branch of the amber repository; see
> `java.lang.runtime.SwitchBootstraps` in that branch for (rough!)
> implementations of the bootstraps.
>
> ## Patterns in narrow-target switches
>
> When we add patterns, we may encounter switches whose targets are tightly typed
> (e.g., `String` or `int`) but still use some patterns in their expression. For
> switches whose target type is a primitive, primitive box, `String`, or `enum`,
> we'd like to use the optimized translation strategy outlined here, but the
> following kinds of patterns might still show up in a switch on, say, `Integer`:
>
>     case var x:
>     case _:
>     case Integer x:
>     case Integer(var x):
>
> The first three can be translated away by the source compiler, as they are
> semantically equivalent to `default`. If any nontrivial patterns are present
> (including deconstruction patterns), we may need to translate as a pattern
> switch scheme -- see Part 2.
> (While the language may not distinguish between
> "legacy" and "pattern" switches -- in that all switches are pattern switches --
> we'd like to avoid giving up obvious optimizations if we can.)
>
> # Part 2 -- type test patterns and guards
>
> A key motivation for reexamining switch translation is the impending arrival of
> patterns in switch. We expect switch translation for the pattern case to follow
> a similar structure -- lower to an `int` switch and use an indy-based
> classifier to select an index. However, there are a few additional
> complexities. One is that pattern cases may have guards, which means we need to
> be able to re-enter the bootstrap with an indication to "continue matching from
> case N", in the event of a failed guard. (Even if the language doesn't support
> guards directly, the obvious implementation strategy for nested patterns is to
> desugar them into guards.)
>
> Translating pattern switches is more complicated because there are more options
> for how to divide the work between the statically generated code and the switch
> classifier, and different choices have different performance side-effects (are
> binding variables "boxed" into a tuple to be returned, or do they need to be
> redundantly calculated).

I'm still not sure that having guards makes sense from the language perspective; I find a switch with guards to be less readable than a switch with an if (at least in Scala).

> ## Type-test patterns
>
> Type-test patterns are notable because their applicability predicate is purely
> based on the type system, meaning that the compiler can directly reason about
> it both statically (using flow analysis, optimizing away dynamic type tests)
> and dynamically (with `instanceof`.) A switch involving type-tests:
>
>     switch (x) {
>         case String s: ...
>         case Integer i: ...
>         case Long l: ...
>     }
>
> can (among other strategies) be translated into a chain of `if-else` using
> `instanceof` and casts:
>
>     if (x instanceof String) { String s = (String) x; ...
>     }
>     else if (x instanceof Integer) { Integer i = (Integer) x; ... }
>     else if (x instanceof Long) { Long l = (Long) x; ... }
>
> #### Guards
>
> The `if-else` desugaring can also naturally handle guards:
>
>     switch (x) {
>         case String s
>             where (s.length() > 0): ...
>         case Integer i
>             where (i > 0): ...
>         case Long l
>             where (l > 0L): ...
>     }
>
> can be translated to:
>
>     if (x instanceof String
>         && ((String) x).length() > 0) { String s = (String) x; ... }
>     else if (x instanceof Integer
>         && ((Integer) x) > 0) { Integer i = (Integer) x; ... }
>     else if (x instanceof Long
>         && ((Long) x) > 0L) { Long l = (Long) x; ... }
>
> #### Performance concerns
>
> The translation to `if-else` chains is simple (for switches without
> fallthrough), but is harder for the VM to optimize, because we've used a more
> general control flow mechanism. If the target is an empty `String`, which means
> we'd pass the first `instanceof` but fail the guard, class-hierarchy analysis
> could tell us that it can't possibly be an `Integer` or a `Long`, and so
> there's no need to perform those tests. But generating code that takes
> advantage of this information is more complex.

It's worse than that: it's not an if-else chain, it's an if-instanceof-else chain, and instanceof by itself is decomposed into several ifs. So when you have enough if-instanceof-else arms (the number depends on your CPU), the assembly code is full of conditional branches and it will be really slow, because the branch predictor will be lost.

> In the extreme case, where a switch consists entirely of type test patterns for
> final classes, this could be performed as an O(1) operation by hashing. And
> this is a common case involving switches over alternatives in a sum (sealed)
> type. (We shouldn't rely on finality at compile time, as this can change
> between compile and run time, but we should take advantage of this at run time
> if we can.)
Hashing is complex without VM support, because you have to be able to update the cache dynamically and must not hold strong pointers to the classes; otherwise you will not be able to unload the classes. So while the complexity is O(1), it may require several loads, making hashing useful only when there are quite a few cases.

> Finally, the straightforward static translation may miss opportunities for
> optimization. For example:
>
>     switch (x) {
>         case Point p
>             where p.x > 0 && p.y > 0: A
>         case Point p
>             where p.x > 0 && p.y == 0: B
>     }
>
> Here, not only would we potentially test the target twice to see if it is a
> `Point`, but we then further extract the `x` component twice and perform the
> `p.x > 0` test twice.
>
> #### Optimization opportunities
>
> The compiler can eliminate some redundant calculations through straightforward
> techniques. The previous switch can be transformed to:
>
>     switch (x) {
>         case Point p:
>             if (((Point) p).x > 0 && ((Point) p).y > 0) { A }
>             else if (((Point) p).x > 0 && ((Point) p).y == 0) { B }
>
> to eliminate the redundant `instanceof` (and admits further CSE optimizations.)
>
> #### Clause reordering
>
> The above example was easy to transform because the two `case Point` clauses
> were adjacent. But what if they are not? In some cases, it is safe to reorder
> them. For types `T` and `U`, it is safe to reorder `case T` and `case U` if the
> two types have no intersection; that is, there can be no types that are subtypes of
> them both. This is true when `T` and `U` are classes and neither extends the
> other, or when one is a final class and the other is an interface that the
> class does not implement.
>
> The compiler could then reorder case clauses so that all the ones whose first
> test is `case Point` are adjacent, and then coalesce them all into a single arm
> of the `if-else` chain.
>
> A possible spoiler here is fallthrough; if case A falls into case B, then cases
> A and B have to be moved as a group.
> (This is another reason to consider
> limiting fallthrough.)
>
> A bigger possible spoiler here is separate compilation. If at compile time, we
> see that `T` and `U` are disjoint types, do we want to bake that assumption
> into the compilation, or do we have to re-check that assumption at runtime?
>
> #### Summary of if-else translation
>
> While the if-else translation at first looks pretty bad, we are able to extract
> a fair amount of redundancy through well-understood compiler transformations.
> If an N-way switch has only M distinct types in it, in most cases we can reduce
> the cost from _O(N)_ to _O(M)_. Sometimes _M == N_, so this doesn't help, but
> sometimes _M << N_ (and sometimes `N` is small, in which case _O(N)_ is
> fine.)
>
> Reordering clauses involves some risk; specifically, that the class hierarchy
> will change between compile and run time. It seems eminently safe to reorder
> `String` and `Integer`, but more questionable to reorder an arbitrary class
> `Foo` with `Runnable`, even if `Foo` doesn't implement `Runnable` now, because
> it might easily be changed to do so later. Ideally we'd like to perform
> class-hierarchy optimizations using the runtime hierarchy, not the compile-time
> hierarchy.
>
> ## Type classifiers
>
> The technique outlined in _Part 1_, where we lower the complex switch to a dense
> `int` switch, and use an indy-based classifier to select an index, is
> applicable here as well. First let's consider a switch consisting only of
> unguarded type-test patterns, optionally with a default clause.
>
> We'll start with an `indy` bootstrap whose static arguments are `Class` constants
> corresponding to each arm of the switch, whose dynamic argument is the switch
> target, and whose return value is a case number (or distinguished sentinels for
> "no match" and `null`.)
> We can easily implement such a bootstrap with a linear
> search, but can also do better; if some subset of the classes are `final`, we
> can choose between these more quickly (such as via binary search on
> `hashCode()`, hash function, or hash table), and we need perform only a single
> operation to test all of those at once. Dynamic techniques (such as building
> a hash map of previously seen target types), which `indy` is well-suited to,
> can asymptotically approach _O(1)_ even when the classes involved are not
> final.
>
> So we can lower:
>
>     switch (x) {
>         case T t: A
>         case U u: B
>         case V v: C
>     }
>
> to
>
>     int y = indy[bootstrap=typeSwitch(T.class, U.class, V.class)](x)
>     switch (y) {
>         case 0: A
>         case 1: B
>         case 2: C
>     }
>
> This has the advantages that the generated code is very similar to the source
> code, we can (in some cases) get _O(1)_ dispatch performance, and we can handle
> fallthrough with no additional complexity.

As I said above, you need to do the checkcast twice in that case.

> #### Guards
>
> There are two approaches we could take to add support for guards into the
> process; we could try to teach the bootstrap about guards (and would have to
> pass locals that appear in guard expressions as additional arguments to the
> classifier), or we could leave guards to the generated bytecode. The latter
> seems far more attractive, but requires some tweaks to the bootstrap arguments
> and to the shape of the generated code.
>
> If the classifier says "you have matched case #3", but then we fail the guard
> for #3, we want to go back into the classifier and start again at #4.
> (Sometimes the classifier can also use this information ("start over at #4") to
> optimize away unnecessary tests.)
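One way to picture the "dynamic techniques" mentioned for unguarded type switches is a linear search whose result is cached per runtime class. The sketch below is only an assumption about how such a classifier could look: `TypeClassifier` is a hypothetical name, it uses the JDK's `ClassValue` as the per-class cache, and it returns -1 for `null` and N for "no match" to mirror the sentinels described above.

```java
class TypeClassifier {
    private final int noMatch;
    private final ClassValue<Integer> firstMatch;

    TypeClassifier(Class<?>... caseTypes) {
        final Class<?>[] types = caseTypes.clone();
        this.noMatch = types.length;
        // Linear search runs once per distinct runtime class; later lookups for
        // the same class are served from the ClassValue cache.
        this.firstMatch = new ClassValue<Integer>() {
            @Override
            protected Integer computeValue(Class<?> type) {
                for (int i = 0; i < types.length; i++) {
                    if (types[i].isAssignableFrom(type)) return i;
                }
                return types.length;
            }
        };
    }

    int classify(Object target) {
        if (target == null) return -1;               // distinguished null sentinel
        return firstMatch.get(target.getClass());
    }

    int noMatchIndex() {
        return noMatch;
    }
}
```

This does nothing about the double checkcast concern raised above: the generated switch arm would still have to cast the target after the classifier has already tested its type.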
> We add a second argument (where to start) to the classifier invocation
> signature, and wrap the switch in a loop, lowering:
>
>     switch (target) {
>         case T t where (e1): A
>         case T t where (e2): B
>         case U u where (e3): C
>     }
>
> into
>
>     int index = -1; // start at the top
>     while (true) {
>         index = indy[...](target, index)
>         switch (index) {
>             case 0: if (!e1) continue; A
>             case 1: if (!e2) continue; B
>             case 2: if (!e3) continue; C
>             default: break;
>         }
>         break;
>     }
>
> For cases where the same type test is repeated in consecutive positions (at N
> and N+1), we can have the static compiler coalesce them as above, or we could
> have the bootstrap maintain a table so that if you re-enter the bootstrap where
> the previous answer was N, then it can immediately return N+1. Similarly, if N
> and N+1 are known to be mutually exclusive types (like `String` and `Integer`),
> on reentering the classifier with N, we can skip right to N+2 since if we
> matched `String`, we cannot match `Integer`. Lookup tables for such
> optimizations can be built at callsite linkage time.
>
> #### Mixing constants and type tests
>
> This approach also extends to tests that are a mix of constant patterns and
> type-test patterns, such as:
>
>     switch (x) {
>         case "Foo": ...
>         case 0L: ...
>         case Integer i:
>     }
>
> We can extend the bootstrap protocol to accept constants as well as types, and
> it is a straightforward optimization to combine both type matching and constant
> matching in a single pass.
>
> ## Nested patterns
>
> Nested patterns are essentially guards; even if we don't expose guards in the
> language, we can desugar
>
>     case Point(0, var x):
>
> into the equivalent of
>
>     case Point(var a, var x) && a matches 0:
>
> using the same translation story as above -- use the classifier to select a
> candidate case arm based on the top-type of the pattern, and then do additional
> checks in the generated bytecode, and if the checks fail, continue and re-enter
> the classifier starting at the next case.
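The re-entry loop for guards can be tried out in ordinary Java by replacing the `indy` call with a plain method. Everything named here is illustrative: `GuardedTypeSwitch`, `classify`, and the three guards are assumptions chosen for the example, and the classifier is a simple linear search that resumes after the previously returned index.

```java
class GuardedTypeSwitch {
    // Case types, in source order: two String arms and one Integer arm.
    private static final Class<?>[] CASES = { String.class, String.class, Integer.class };

    // Hypothetical classifier: first case after `previous` whose type matches,
    // or caseTypes.length when nothing further matches.
    static int classify(Class<?>[] caseTypes, Object target, int previous) {
        for (int i = previous + 1; i < caseTypes.length; i++) {
            if (caseTypes[i].isInstance(target)) return i;
        }
        return caseTypes.length;
    }

    static String describe(Object target) {
        int index = -1;                                   // start at the top
        while (true) {
            index = classify(CASES, target, index);
            switch (index) {
                case 0:
                    if (!((String) target).isEmpty()) continue;      // guard failed, re-enter
                    return "empty string";
                case 1:
                    if (!(((String) target).length() > 3)) continue; // guard failed, re-enter
                    return "long string";
                case 2:
                    if (!(((Integer) target) > 0)) continue;         // guard failed, re-enter
                    return "positive int";
                default:
                    return "no match";
            }
        }
    }
}
```

Here `continue` targets the enclosing `while` loop, which is exactly the existing meaning of `continue` inside a `switch` that the thread points out when discussing an explicit `continue` for switch.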
> #### Explicit continue
>
> An alternative to exposing guards is to expose an explicit `continue` statement
> in switch, which would have the effect of "keep matching at the next case."
> Then guards could be expressed imperatively as:
>
>     case P:
>         if (!guard)
>             continue;
>         ...
>         break;
>     case Q: ...

Rémi

From brian.goetz at oracle.com Sat Apr 7 15:39:09 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 7 Apr 2018 11:39:09 -0400
Subject: Feedback wanted: switch expression typing
In-Reply-To:
References: <350254623.2244204.1522428861283.JavaMail.zimbra@u-pem.fr> <17865431.2309562.1522491830602.JavaMail.zimbra@u-pem.fr>
Message-ID: <99E5310B-D197-4E44-89C0-B39FA8F8D672@oracle.com>

We can start on later now. We can warn on conditionals where the two would give a different answer, nudging people to fix their code, and then bring them into alignment after everyone has been suitably irritated by warnings.

> On Mar 31, 2018, at 7:13 AM, Doug Lea
> wrote:
>
> On Sat, March 31, 2018 6:23 am, forax at univ-mlv.fr wrote:
>> The fact that the semantics of ?: is very ad-hoc is a kind of accident of
>> history,
>> we may want to fix it but I do not see why we have to fix it at the same
>> time that we introduce the expression switch,
>> we can fix the semantics of ?: later or never.
>
> Where "later" probably means "never". It should be fixed now.
> I agree that (B) and (C) are basically the same, so choose (C).
> I've had to fiddle with ?: to get the compiler to shut up about
> reasonable-looking expressions. (Sorry, I can't recall examples.)
> Having the same story for both of them would be best, assuming
> that existing code doesn't break.
>
> -Doug
>
>>
>> Rémi
>>
>> ----- Original message -----
>>> From: "daniel smith"
>>> To: "Remi Forax"
>>> Cc: "amber-spec-experts"
>>> Sent: Saturday, March 31, 2018 03:44:49
>>> Subject: Re: Feedback wanted: switch expression typing
>>
>>>> On Mar 30, 2018, at 10:54 AM, Remi Forax wrote:
>>>>
>>>> I do not see (B) as sacrificing the consistency because the premise is
>>>> that an
>>>> expression switch should be consistent with ?:
>>>>
>>>> But an expression switch can also be modeled as a classical switch that
>>>> returns
>>>> its value to a local variable.
>>>>
>>>> int a = switch(foo) {
>>>> case 'a' -> 2;
>>>> case 'b' -> 3;
>>>> }
>>>> can be seen as
>>>> int a = $switch(foo);
>>>> with
>>>> int $switch(char foo) {
>>>> case 'a': return 2;
>>>> case 'b': return 3;
>>>> }
>>>
>>> I mean, sure, this is another way to assert "switches in assignment
>>> contexts
>>> should always be poly expressions".
>>>
>>> But it's just as easy to assert "conditional expressions in assignment
>>> contexts
>>> should always be poly expressions".
>>>
>>> int a = test ? 2 : 3;
>>> can be seen as
>>> int a = $conditional(test);
>>> with
>>> int $conditional(boolean test) {
>>> if (test) return 2;
>>> else return 3;
>>> }
>>>
>>> Those are probably good principles.
But if we embrace them, we're doing
>>> (C).
>>>
>>> -Dan
>>
>
>

From amaembo at gmail.com Sun Apr 8 12:09:46 2018
From: amaembo at gmail.com (Tagir Valeev)
Date: Sun, 08 Apr 2018 12:09:46 +0000
Subject: Switch translation
In-Reply-To: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com>
References: <3de80623-baf4-e8b1-f58c-2e3e52c52b2a@oracle.com>
Message-ID:

Hello!

> A possible spoiler here is fallthrough; if case A falls into case B, then cases A and B have to be moved as a group. (This is another reason to consider limiting fallthrough.)

I don't think it's a big problem. If we first just need to determine an index to be passed to the tableswitch, then only the final tableswitch will have a fallthrough, while the index determination procedure never needs a fallthrough. Thus during the index determination we are free to reorder branches along with the index values.

With best regards,
Tagir Valeev.

On Apr 6, 2018 at 22:52, "Brian Goetz" wrote:

The following outlines our story for translating improved switches, including both the switch improvements coming as part of JEP 325, and follow-on work to add pattern matching to switches. Much of this has been discussed already over the last year, but here it is in one place.

# Switch Translation

#### Maurizio Cimadamore and Brian Goetz

#### April 2018

## Part 1 -- constant switches

This part examines the current translation of `switch` constructs by `javac`, and proposes a more general translation for switching on primitives, boxes, strings, and enums, with the goals of:

- Unify the treatment of `switch` variants, simplifying the compiler implementation and reducing the static footprint of generated code;
- Move responsibility for target classification from compile time to run time, allowing us to more freely update the logic without updating the compiler.

## Current translation

Switches on `int` (and the smaller integer primitives) are translated in one of two ways.
If the labels are relatively dense, we translate an `int` switch to a `tableswitch`; if they are sparse, we translate to a `lookupswitch`. The current heuristic appears to be that we use a `tableswitch` if it results in a smaller bytecode than a `lookupswitch` (which uses twice as many bytes per entry), which is a reasonable heuristic.

#### Switches on boxes

Switches on primitive boxes are currently implemented as if they were primitive switches, unconditionally unboxing the target before entry (possibly throwing NPE).

#### Switches on strings

Switches on strings are implemented as a two-step process, exploiting the fact that strings cache their `hashCode()` and that hash codes are reasonably spread out. Given a switch on strings like the one below:

    switch (s) {
        case "Hello": ...
        case "World": ...
        default: ...
    }

The compiler desugars this into two separate switches, where the first switch maps the input strings into a range of numbers [0..1], as shown below, which can then be used in a subsequent plain switch on ints. The generated code unconditionally calls `hashCode()`, again possibly throwing NPE.

    int index=-1;
    switch (s.hashCode()) {
        case 12345: if (!s.equals("Hello")) break; index = 1; break;
        case 6789: if (!s.equals("World")) break; index = 0; break;
        default: index = -1;
    }
    switch (index) {
        case 0: ...
        case 1: ...
        default: ...
    }

If there are hash collisions between the strings, the first switch must try all possible matching strings.

#### Switches on enums

Switches on `enum` constants exploit the fact that enums have (usually dense) integral ordinal values. Unfortunately, because an ordinal value can change between compilation time and runtime, we cannot rely on this mapping directly, but instead need to do an extra layer of mapping. Given a switch like:

    switch(color) {
        case RED: ...
        case GREEN: ...
    }

The compiler numbers the cases starting at 1 (as with string switch), and creates a synthetic class that maps the runtime values of the enum ordinals to the statically numbered cases:

    class Outer$0 {
        static final int[] $EnumMap$Color = new int[Color.values().length]; // synthetic
        static {
            try { $EnumMap$Color[RED.ordinal()] = 1; } catch (NoSuchFieldError ex) {}
            try { $EnumMap$Color[GREEN.ordinal()] = 2; } catch (NoSuchFieldError ex) {}
        }
    }

Then, the switch is translated as follows:

    switch(Outer$0.$EnumMap$Color[color.ordinal()]) {
        case 1: stmt1;
        case 2: stmt2
    }

In other words, we construct an array whose size is the cardinality of the enum, and then the element at position **i** of such array will contain the case index corresponding to the enum constant whose ordinal is **i**.

## A more general scheme

The handling of strings and enums gives us a hint of how to create a more regular scheme; for `switch` targets more complex than `int`, we lower the `switch` to an `int` switch with consecutive `case` labels, and use a separate process to map the target into the range of synthetic case labels.

Now that we have `invokedynamic` in our toolbox, we can reduce all of the non-`int` cases to a single form, where we number the cases with consecutive integers, and perform case selection via an `invokedynamic`-based classifier function, whose static argument list receives a description of the actual targets, and which returns an `int` identifying what `case` to select.
This approach has several advantages:

- Reduced compiler complexity -- all switches follow a common pattern;
- Reduced static code size;
- The classification function can select from a wide range of strategies (linear search, binary search, building a `HashMap`, constructing a perfect hash function, etc), which can vary over time or from situation to situation;
- We are free to improve the strategy or select an alternate strategy (say, to optimize for startup time) without having to recompile the code;
- Hopefully at least as JIT-friendly as the existing translation, if not more so.

We can also use this approach in preference to `lookupswitch` for non-dense `int` switches, as well as use it to extend `switch` to handle `long`, `float`, and `double` targets (which were surely excluded in part because the JVM didn't provide a convenient translation target for these types.)

#### Bootstrap design

When designing the `invokedynamic` bootstraps to support this translation, we face the classic lumping-vs-splitting decision. For now, we'll bias towards splitting. In the following example, `BOOTSTRAP_PREAMBLE` indicates the usual leading arguments for an indy bootstrap. We assume the compiler has numbered the case values densely from 0..N, and the bootstrap will return a value in [0,N) for success, or N for "no match".

A strawman design might be:

    // Numeric switches for P, accepts invocation as P -> I or Box(P) -> I
    CallSite intSwitch(BOOTSTRAP_PREAMBLE, int... caseValues)

    // Switch for String, invocation descriptor is String -> I
    CallSite stringSwitch(BOOTSTRAP_PREAMBLE, String... caseValues)

    // Switch for Enum, invocation descriptor is E -> I
    CallSite enumSwitch(BOOTSTRAP_PREAMBLE, Class<Enum<E>> clazz, String... caseNames)

It might be possible to encode all of these into a single bootstrap, but given that the compiler already treats each type slightly differently, it seems there is little value in this sort of lumping for non-pattern switches.
The `enumSwitch` bootstrap as proposed uses `String` values to describe the enum constants, rather than encoding the enum constants directly via condy. This allows us to be more robust to enums disappearing after compilation. This strategy is also dependent on having broken the limitation on 253 bootstrap arguments in indy/condy. #### Extending to other primitive types This approach extends naturally to other primitive types (long, double, float), by the addition of some more bootstraps (which need to deal with the additional complexities of infinity, NaN, etc): CallSite longSwitch(BOOTSTRAP_PREAMBLE, long... caseValues) CallSite floatSwitch(BOOTSTRAP_PREAMBLE, float... caseValues) CallSite doubleSwitch(BOOTSTRAP_PREAMBLE, double... caseValues) #### Extending to null The scheme as proposed above does not explicitly handle nulls, which is a feature we'd like to have in `switch`. There are a few ways we could add null handling into the API: - Split entry points into null-friendly or null-hostile switches; - Find a way to encode nulls in the array of case values (which can be done with condy); - Always treat null as a possible input and a distinguished output, and have the compiler ensure the switch can handle this distinguished output. The last strategy is appealing and straightforward; assign a sentinel value (-1) to `null`, and always return this sentinel when the input is null. The compiler ensures that some case handles `null`, and if no case handles `null` then it inserts an implicit case -1: throw new NullPointerException(); into the generated code. 
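
As a rough illustration of the null-sentinel convention just described, here is a minimal plain-Java sketch. It is not the `SwitchBootstraps` implementation: the class `StringClassifierSketch` and its methods are hypothetical names, and a `HashMap` lookup stands in for the call site that a `stringSwitch` bootstrap would link. It only shows the numbering contract: -1 for `null`, [0,N) for a matching label, N for "no match".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the linked call site of a string switch.
// Assumption: a HashMap lookup replaces the invokedynamic machinery.
final class StringClassifierSketch {
    private final Map<String, Integer> indexByLabel = new HashMap<>();
    private final int noMatch;

    StringClassifierSketch(String... caseValues) {
        for (int i = 0; i < caseValues.length; i++) {
            indexByLabel.putIfAbsent(caseValues[i], i); // first label wins
        }
        noMatch = caseValues.length; // N means "no match" (the default arm)
    }

    // Returns -1 for null, the label index for a match, N otherwise.
    int classify(String target) {
        if (target == null) {
            return -1; // distinguished sentinel for null
        }
        return indexByLabel.getOrDefault(target, noMatch);
    }

    // Shape of the code the compiler would emit around the classifier.
    static String dispatch(StringClassifierSketch classifier, String x) {
        switch (classifier.classify(x)) {
            case -1: throw new NullPointerException(); // implicit null case
            case 0:  return "Hello arm";
            case 1:  return "World arm";
            default: return "default arm";
        }
    }
}
```

A switch that wanted to handle `null` itself would simply supply its own body for `case -1` instead of the implicit throw.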
#### General example If we have a string switch: switch (x) { case "Foo": m(); break; case "Bar": n(); // fall through case "Baz": r(); break; default: p(); } we translate into: int t = indy[bsm=stringSwitch["Foo", "Bar", "Baz"]](x) switch (t) { case -1: throw new NullPointerException(); // implicit null case case 0: m(); break; case 1: n(); // fall through case 2: r(); break; case 3: p(); // default case } All switches, with the exception of `int` switches (and maybe not even non-dense `int` switches), follow this exact pattern. If the target type is not a reference type, the `null` case is not needed. This strategy is implemented in the `switch` branch of the amber repository; see `java.lang.runtime.SwitchBootstraps` in that branch for (rough!) implementations of the bootstraps. ## Patterns in narrow-target switches When we add patterns, we may encounter switches whose targets are tightly typed (e.g., `String` or `int`) but still use some patterns in their expression. For switches whose target type is a primitive, primitive box, `String`, or `enum`, we'd like to use the optimized translation strategy outlined here, but the following kinds of patterns might still show up in a switch on, say, `Integer`: case var x: case _: case Integer x: case Integer(var x): The first three can be translated away by the source compiler, as they are semantically equivalent to `default`. If any nontrivial patterns are present (including deconstruction patterns), we may need to translate as a pattern switch scheme -- see Part 2. (While the language may not distinguish between "legacy" and "pattern" switches -- in that all switches are pattern switches -- we'd like to avoid giving up obvious optimizations if we can.) # Part 2 -- type test patterns and guards A key motivation for reexamining switch translation is the impending arrival of patterns in switch. 
We expect switch translation for the pattern case to follow a similar structure -- lower to an `int` switch and use an indy-based classifier to select an index. However, there are a few additional complexities. One is that pattern cases may have guards, which means we need to be able to re-enter the bootstrap with an indication to "continue matching from case N", in the event of a failed guard. (Even if the language doesn't support guards directly, the obvious implementation strategy for nested patterns is to desugar them into guards.) Translating pattern switches is more complicated because there are more options for how to divide the work between the statically generated code and the switch classifier, and different choices have different performance side-effects (are binding variables "boxed" into a tuple to be returned, or do they need to be redundantly calculated). ## Type-test patterns Type-test patterns are notable because their applicability predicate is purely based on the type system, meaning that the compiler can directly reason about it both statically (using flow analysis, optimizing away dynamic type tests) and dynamically (with `instanceof`.) A switch involving type-tests: switch (x) { case String s: ... case Integer i: ... case Long l: ... } can (among other strategies) be translated into a chain of `if-else` using `instanceof` and casts: if (x instanceof String) { String s = (String) x; ... } else if (x instanceof Integer) { Integer i = (Integer) x; ... } else if (x instanceof Long) { Long l = (Long) x; ... } #### Guards The `if-else` desugaring can also naturally handle guards: switch (x) { case String s where (s.length() > 0): ... case Integer i where (i > 0): ... case Long l where (l > 0L): ... } can be translated to: if (x instanceof String && ((String) x).length() > 0) { String s = (String) x; ... } else if (x instanceof Integer && ((Integer) x) > 0) { Integer i = (Integer) x; ... 
    } else if (x instanceof Long && ((Long) x) > 0L) {
        Long l = (Long) x;
        ...
    }

#### Performance concerns

The translation to `if-else` chains is simple (for switches without fallthrough), but is harder for the VM to optimize, because we've used a more general control flow mechanism. If the target is an empty `String`, which means we'd pass the first `instanceof` but fail the guard, class-hierarchy analysis could tell us that it can't possibly be an `Integer` or a `Long`, and so there's no need to perform those tests. But generating code that takes advantage of this information is more complex.

In the extreme case, where a switch consists entirely of type test patterns for final classes, this could be performed as an O(1) operation by hashing. And this is a common case involving switches over alternatives in a sum (sealed) type. (We shouldn't rely on finality at compile time, as this can change between compile and run time, but we should take advantage of this at run time if we can.)

Finally, the straightforward static translation may miss opportunities for optimization. For example:

    switch (x) {
        case Point p where p.x > 0 && p.y > 0: A
        case Point p where p.x > 0 && p.y == 0: B
    }

Here, not only would we potentially test the target twice to see if it is a `Point`, but we then further extract the `x` component twice and perform the `p.x > 0` test twice.

#### Optimization opportunities

The compiler can eliminate some redundant calculations through straightforward techniques. The previous switch can be transformed to:

    switch (x) {
        case Point p:
            if (((Point) p).x > 0 && ((Point) p).y > 0) { A }
            else if (((Point) p).x > 0 && ((Point) p).y == 0) { B }
    }

to eliminate the redundant `instanceof` (and admits further CSE optimizations.)

#### Clause reordering

The above example was easy to transform because the two `case Point` clauses were adjacent. But what if they are not? In some cases, it is safe to reorder them.
For types `T` and `U`, it is safe to reorder `case T` and `case U` if the two types have no intersection; that is, there can be no types that are subtypes of them both. This is true when `T` and `U` are classes and neither extends the other, or when one is a final class and the other is an interface that the class does not implement.

The compiler could then reorder case clauses so that all the ones whose first test is `case Point` are adjacent, and then coalesce them all into a single arm of the `if-else` chain.

A possible spoiler here is fallthrough; if case A falls into case B, then cases A and B have to be moved as a group. (This is another reason to consider limiting fallthrough.)

A bigger possible spoiler here is separate compilation. If at compile time, we see that `T` and `U` are disjoint types, do we want to bake that assumption into the compilation, or do we have to re-check that assumption at runtime?

#### Summary of if-else translation

While the if-else translation at first looks pretty bad, we are able to extract a fair amount of redundancy through well-understood compiler transformations. If an N-way switch has only M distinct types in it, in most cases we can reduce the cost from _O(N)_ to _O(M)_. Sometimes _M == N_, so this doesn't help, but sometimes _M << N_ (and sometimes `N` is small, in which case _O(N)_ is fine.)

Reordering clauses involves some risk; specifically, that the class hierarchy will change between compile and run time. It seems eminently safe to reorder `String` and `Integer`, but more questionable to reorder an arbitrary class `Foo` with `Runnable`, even if `Foo` doesn't implement `Runnable` now, because it might easily be changed to do so later. Ideally we'd like to perform class-hierarchy optimizations using the runtime hierarchy, not the compile-time hierarchy.
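
The disjointness rule from the clause-reordering discussion can be sketched as a small reflective check. This is only an illustration: `ReorderCheckSketch` and `provablyDisjoint` are hypothetical names, and because the check runs against loaded `Class` objects it reflects the run-time hierarchy rather than the compile-time one.

```java
import java.lang.reflect.Modifier;

// Hypothetical helper, not part of any proposed API.
final class ReorderCheckSketch {

    // Returns true only when no type can be a subtype of both t and u,
    // following the two rules stated in the text; false means "unknown
    // or overlapping", so the clauses must keep their source order.
    static boolean provablyDisjoint(Class<?> t, Class<?> u) {
        if (t.isAssignableFrom(u) || u.isAssignableFrom(t)) {
            return false; // one is a subtype of the other
        }
        boolean tIsInterface = t.isInterface();
        boolean uIsInterface = u.isInterface();
        if (!tIsInterface && !uIsInterface) {
            return true; // two unrelated classes: single inheritance
        }
        if (tIsInterface && uIsInterface) {
            return false; // some class could implement both
        }
        // One class, one interface the class does not implement:
        // disjoint only if the class can never gain a subclass.
        Class<?> cls = tIsInterface ? u : t;
        return Modifier.isFinal(cls.getModifiers());
    }
}
```

Under these rules `String` and `Integer` are reorderable, while a non-final class paired with `Runnable` is not, which matches the `Foo`/`Runnable` concern above.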
## Type classifiers

The technique outlined in _Part 1_, where we lower the complex switch to a dense `int` switch, and use an indy-based classifier to select an index, is applicable here as well. First let's consider a switch consisting only of unguarded type-test patterns, optionally with a default clause.

We'll start with an `indy` bootstrap whose static arguments are `Class` constants corresponding to each arm of the switch, whose dynamic argument is the switch target, and whose return value is a case number (or distinguished sentinels for "no match" and `null`.) We can easily implement such a bootstrap with a linear search, but can also do better; if some subset of the classes are `final`, we can choose between these more quickly (such as via binary search on `hashCode()`, hash function, or hash table), and we need perform only a single operation to test all of those at once. Dynamic techniques (such as building a hash map of previously seen target types), which `indy` is well-suited to, can asymptotically approach _O(1)_ even when the classes involved are not final.

So we can lower:

    switch (x) {
        case T t: A
        case U u: B
        case V v: C
    }

to

    int y = indy[bootstrap=typeSwitch(T.class, U.class, V.class)](x)
    switch (y) {
        case 0: A
        case 1: B
        case 2: C
    }

This has the advantages that the generated code is very similar to the source code, we can (in some cases) get _O(1)_ dispatch performance, and we can handle fallthrough with no additional complexity.

#### Guards

There are two approaches we could take to add support for guards into the process; we could try to teach the bootstrap about guards (and would have to pass locals that appear in guard expressions as additional arguments to the classifier), or we could leave guards to the generated bytecode. The latter seems far more attractive, but requires some tweaks to the bootstrap arguments and to the shape of the generated code.
If the classifier says "you have matched case #3", but then we fail the guard for #3, we want to go back into the classifier and start again at #4. (Sometimes the classifier can also use this information ("start over at #4") to optimize away unnecessary tests.) We add a second argument (where to start) to the classifier invocation signature, and wrap the switch in a loop, lowering: switch (target) { case T t where (e1): A case T t where (e2): B case U u where (e3): C } into int index = -1; // start at the top while (true) { index = indy[...](target, index) switch (index) { case 0: if (!e1) continue; A case 1: if (!e2) continue; B case 2: if (!e3) continue; C default: break; } break; } For cases where the same type test is repeated in consecutive positions (at N and N+1), we can have the static compiler coalesce them as above, or we could have the bootstrap maintain a table so that if you re-enter the bootstrap where the previous answer was N, then it can immediately return N+1. Similarly, if N and N+1 are known to be mutually exclusive types (like `String` and `Integer`), on reentering the classifier with N, we can skip right to N+2 since if we matched `String`, we cannot match `Integer`. Lookup tables for such optimizations can be built at callsite linkage time. #### Mixing constants and type tests This approach also extends to tests that are a mix of constant patterns and type-test patterns, such as: switch (x) { case "Foo": ... case 0L: ... case Integer i: } We can extend the bootstrap protocol to accept constants as well as types, and it is a straightforward optimization to combine both type matching and constant matching in a single pass. 
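
The re-entry protocol for guards described in this part can be sketched in plain Java. This is an assumption-laden illustration, not the proposed bootstrap: `TypeClassifierSketch`, `classify` and `describe` are hypothetical names, the classifier is a linear search rather than an `indy` call site, and "no match" (including a `null` target) is reported as N, the number of case types.

```java
// Hypothetical linear-search stand-in for the indy-linked type classifier.
final class TypeClassifierSketch {
    private final Class<?>[] types;

    TypeClassifierSketch(Class<?>... types) {
        this.types = types.clone();
    }

    // Returns the first index greater than 'previous' whose type matches
    // the target, or types.length when nothing further matches.
    int classify(Object target, int previous) {
        for (int i = previous + 1; i < types.length; i++) {
            if (types[i].isInstance(target)) { // false for null targets
                return i;
            }
        }
        return types.length;
    }

    private static final TypeClassifierSketch CLASSIFIER =
            new TypeClassifierSketch(String.class, String.class, Integer.class);

    // Models: case String s where (s.length() > 3); case String s where
    // (!s.isEmpty()); case Integer i where (i > 0); default.
    static String describe(Object target) {
        int index = -1; // start at the top
        while (true) {
            index = CLASSIFIER.classify(target, index);
            switch (index) {
                case 0:
                    if (!(((String) target).length() > 3)) continue; // guard failed, re-enter
                    return "long string";
                case 1:
                    if (((String) target).isEmpty()) continue;
                    return "short string";
                case 2:
                    if (!(((Integer) target) > 0)) continue;
                    return "positive integer";
                default:
                    return "no match";
            }
        }
    }
}
```

Here each `continue` restarts the enclosing loop, so the classifier is re-entered with the index of the arm whose guard just failed; a smarter classifier could use that index to skip mutually exclusive types, as the text suggests.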
## Nested patterns

Nested patterns are essentially guards; even if we don't expose guards in the language, we can desugar

    case Point(0, var x):

into the equivalent of

    case Point(var a, var x) && a matches 0:

using the same translation story as above -- use the classifier to select a candidate case arm based on the top-type of the pattern, and then do additional checks in the generated bytecode, and if the checks fail, continue and re-enter the classifier starting at the next case.

#### Explicit continue

An alternative to exposing guards is to expose an explicit `continue` statement in switch, which would have the effect of "keep matching at the next case." Then guards could be expressed imperatively as:

    case P:
        if (!guard)
            continue;
        ...
        break;
    case Q: ...

From forax at univ-mlv.fr Sun Apr 8 20:16:54 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 8 Apr 2018 22:16:54 +0200 (CEST)
Subject: Record and annotation values
Message-ID: <594033480.1352711.1523218614316.JavaMail.zimbra@u-pem.fr>

Currently annotation values are limited to what is encodable as a constant in the constant pool.
With Condy, we can expand the number of values that can be encodable as a constant in the constant pool to the infinity by allowing a reference to any non-mutable class to be encoded as an annotation value.

For that we need to have a 'protocol' that
- encode an instance of a user defined non-mutable class as a condy by the compiler.
- decode an instance of a user defined non-mutable class by the JDK runtime.

Records with their constructors do not provide enough meta-information for that, the parameter names of the constructors may not be available at runtime.

So i think the constructors parameter names of a Record should be always recorded (as if --parameters was specified for the constructors) to enable non-mutable records to be annotation values.

Rémi

From brian.goetz at oracle.com Sun Apr 8 21:33:33 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 8 Apr 2018 17:33:33 -0400
Subject: Record and annotation values
In-Reply-To: <594033480.1352711.1523218614316.JavaMail.zimbra@u-pem.fr>
References: <594033480.1352711.1523218614316.JavaMail.zimbra@u-pem.fr>
Message-ID:

I think this is one in the category of "just because you can, doesn't mean you should." So before discussing mechanisms, let's discuss goals.

Annotations are for metaDATA. The restriction on what you can put in an annotation stems in part from what you can put in the constant pool, but busting the constant pool limits doesn't automatically mean it's a good idea to make it an annotation value.

The rule that I've been gravitating towards is: anything that is important enough to have a _literal form_ in the language, probably is a good candidate to consider as an annotation value. The obvious first candidate is method refs (excluding instance-bound ones). But even with mrefs in, I'd say no to lambdas, because annotation values should also be scrutable to annotation processors. (If we had collection literals, then collections of things that can already go in annos is probably also a valid candidate.)

Don't forget that just because a record has all-final fields, doesn't mean it's immutable all the way down. And while some notion of "immutable all the way down" has been a frequent wish-list item, taking that on just so you can stash records in annotations is definitely tail wagging the dog.

On the second point (reifying parameter names), while I don't object to doing this, this can't be the "official" way to get this data. It should be reflectively accessible, because you need to nominally tie the constructor arguments to some way of getting their values (getters or fields).
The compiler prototype uses an annotation on the class declaration that stashes this, but that's just a prototyping hack; this probably needs a RecordSignature attribute.

On 4/8/2018 4:16 PM, Remi Forax wrote:
> Currently annotation values are limited to what is encodable as a constant in the constant pool.
> With Condy, we can expand the number of values that can be encodable as a constant in the constant pool to the infinity by allowing a reference to any non-mutable class to be encoded as an annotation value.
>
> For that we need to have a 'protocol' that
> - encode an instance of a user defined non-mutable class as a condy by the compiler.
> - decode an instance of a user defined non-mutable class by the JDK runtime.
>
> Records with their constructors do not provide enough meta-information for that, the parameter names of the constructors may not be available at runtime.
>
> So i think the constructors parameter names of a Record should be always recorded (as if --parameters was specified for the constructors) to enable non-mutable records to be annotation values.
>
> Rémi
>

From amaembo at gmail.com Mon Apr 9 05:07:05 2018
From: amaembo at gmail.com (Tagir Valeev)
Date: Mon, 9 Apr 2018 12:07:05 +0700
Subject: Switch on java.lang.Class
Message-ID:

Hello!

I don't remember whether switch on java.lang.Class instance was discussed. I guess, this pattern is quite common and it will be useful to support it. Such code often appears in deserialization logic when we branch on desired type to deserialize. Here are a couple of examples from opensource libraries:

1.
com.google.gson.DefaultDateTypeAdapter#read (gson-2.8.2):

    Date date = deserializeToDate(in.nextString());
    if (dateType == Date.class) {
      return date;
    } else if (dateType == Timestamp.class) {
      return new Timestamp(date.getTime());
    } else if (dateType == java.sql.Date.class) {
      return new java.sql.Date(date.getTime());
    } else {
      // This must never happen: dateType is guarded in the primary constructor
      throw new AssertionError();
    }

Could be rewritten as:

    Date date = deserializeToDate(in.nextString());
    return switch(dateType) {
      case Date.class -> date;
      case Timestamp.class -> new Timestamp(date.getTime());
      case java.sql.Date.class -> new java.sql.Date(date.getTime());
      default ->
        // This must never happen: dateType is guarded in the primary constructor
        throw new AssertionError();
    };

2. com.fasterxml.jackson.databind.deser.std.FromStringDeserializer#findDeserializer (jackson-databind-2.9.4):

    public static Std findDeserializer(Class rawType) {
      int kind = 0;
      if (rawType == File.class) {
        kind = Std.STD_FILE;
      } else if (rawType == URL.class) {
        kind = Std.STD_URL;
      } else if (rawType == URI.class) {
        kind = Std.STD_URI;
      } else if (rawType == Class.class) {
        kind = Std.STD_CLASS;
      } else if (rawType == JavaType.class) {
        kind = Std.STD_JAVA_TYPE;
      } else if // more branches like this
      } else {
        return null;
      }
      return new Std(rawType, kind);
    }

Could be rewritten as:

    public static Std findDeserializer(Class rawType) {
      int kind = switch(rawType) {
        case File.class -> Std.STD_FILE;
        case URL.class -> Std.STD_URL;
        case URI.class -> Std.STD_URI;
        case Class.class -> Std.STD_CLASS;
        case JavaType.class -> Std.STD_JAVA_TYPE;
        ...
        default -> 0;
      };
      return kind == 0 ? null : new Std(rawType, kind);
    }

In such code all branches are mutually exclusive. The bootstrap method can generate a lookupswitch based on Class.hashCode, then equals checks, pretty similar to String switch implementation.
Unlike String hash codes, Class.hashCode is not stable and varies between JVM launches, but the hash codes are already known during the bootstrap and we can trust them during the VM lifetime, so we can generate a lookupswitch. The minor problematic point is to support primitive classes like int.class. This cannot be passed directly as indy static argument, but this can be solved with condy.

What do you think?

With best regards,
Tagir Valeev.

From forax at univ-mlv.fr Mon Apr 9 06:16:26 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 9 Apr 2018 08:16:26 +0200 (CEST)
Subject: Record and annotation values
In-Reply-To:
References: <594033480.1352711.1523218614316.JavaMail.zimbra@u-pem.fr>
Message-ID: <487962183.23171.1523254586643.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax" , "amber-spec-experts"
> Sent: Sunday, April 8, 2018 23:33:33
> Subject: Re: Record and annotation values

> I think this is one in the category of "just because you can, doesn't
> mean you should." So before discussing mechanisms, let's discuss goals.
>
> Annotations are for metaDATA. The restriction on what you can put in an
> annotation stems in part from what you can put in the constant pool, but
> busting the constant pool limits doesn't automatically mean it's a good
> idea to make it an annotation value.
>
> The rule that I've been gravitating towards is: anything that is
> important enough to have a _literal form_ in the language, probably
> is a good candidate to consider as an annotation value. The obvious
> first candidate is method refs (excluding instance-bound ones). But
> even with mrefs in, I'd say no to lambdas, because annotation values
> should also be scrutable to annotation processors. (If we had
> collection literals, then collections of things that can already go in
> annos is probably also a valid candidate.)
>
> Don't forget that just because a record has all-final fields, doesn't
> mean it's immutable all the way down. And while some notion of
> "immutable all the way down" has been a frequent wish-list item, taking
> that on just so you can stash records in annotations is definitely tail
> wagging the dog.
>

I agree, i do not think it's a good idea to introduce record as annotation value just because we can, but that's not the reason why i think we should introduce record as annotation value.

There are many cases where you can construct an annotation with invalid values, and those invalid annotations will not be caught at compile time but at runtime.
For example,
  @Test({ignore=false, timeout_value=-3, timeout_unit=SECOND})

So having a way to specify a contract for annotation values will make Java safer.

>
> On the second point (reifying parameter names), while I don't object to
> doing this, this can't be the "official" way to get this data. It
> should be reflectively accessible, because you need to nominally tie the
> constructor arguments to some way of getting their values (getters or
> fields). The compiler prototype uses an annotation on the class
> declaration that stashes this, but that's just a prototyping hack; this
> probably needs a RecordSignature attribute.

yes, a specific attribute is perhaps better, i wonder if the information in the RecordSignature should not be part of the Extractor.

Rémi

>
>
> On 4/8/2018 4:16 PM, Remi Forax wrote:
>> Currently annotation values are limited to what is encodable as a constant in
>> the constant pool.
>> With Condy, we can expand the number of values that can be encodable as a
>> constant in the constant pool to the infinity by allowing a reference to any
>> non-mutable class to be encoded as an annotation value.
>>
>> For that we need to have a 'protocol' that
>> - encode an instance of a user defined non-mutable class as a condy by the
>> compiler.
>> - decode an instance of a user defined non-mutable class by the JDK runtime.
>>
>> Records with their constructors do not provide enough meta-information for that,
>> the parameter names of the constructors may not be available at runtime.
>>
>> So i think the constructors parameter names of a Record should be always
>> recorded (as if --parameters was specified for the constructors) to enable
>> non-mutable records to be annotation values.
>>
>> Rémi

From forax at univ-mlv.fr Mon Apr 9 11:49:15 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 9 Apr 2018 13:49:15 +0200 (CEST)
Subject: Expression switch - an alternate proposal
In-Reply-To:
References:
Message-ID: <1418035131.260541.1523274555469.JavaMail.zimbra@u-pem.fr>

moving to spec-experts as it can interest others.

Hi Stephen,
First, thanks for doing a detailed analysis of the rationale of your proposal.

I think i agree with you about the fact that the expression switch does need to support fallthrough, more on that in a following email.

I also agree with you that mixing arrows and colons is confusing.

I am not sure it's that important to make a strong distinction between the statement switch and the expression switch. You do not give any argument for why you think it's important, and in my opinion, it's the kind of thing that you think is important when you introduce the feature and that tends to be less important once the feature is no longer new.

Basically, your proposal is to use -> everywhere, i think i prefer the opposite, do not use arrow at all.

Using arrow in this context is disturbing because it doesn't mean the same thing if it's the arrow of the lambda or the arrow inside an expression switch.
As i know that you love puzzlers, how about ?
  int a = 0;
  switch(x) {
    case 0 -> { a = 3 };
    case 1 -> () -> { a = 3 };
  }

or this one
  switch(x) {
    case 0 -> { break 3; }
    case 1 -> () -> { break 3; };
    case 2 -> { return 3; }
    case 3 -> () -> { return 3; };
  }

The problem is that currently -> means "create a new function scope", not "create a new code scope".

So if not mixing arrows and colons is an important goal, and I think it is, I think it's better not to use the arrow.
After all, we need the arrow syntax in lambdas only to know whether (x) is the start of a lambda or a cast; there is no need to have an arrow in the expression switch.

Moreover, do we really need a shorter syntax, given that we can use break and a value?

Here is your example with no arrow and no short syntax,
  var action = switch (light) {
    case RED:
      log("Red found");
      break "Stop";
    case YELLOW, GREEN:
      break "Go go go";
    default:
      log("WTF: " + light);
      throw new WtfException("Unexpected color: " + light);
  };

and now we can discuss adding a shorter syntax by making break optional if there is one expression.

Rémi

----- Original Message -----
> From: "Stephen Colebourne"
> To: "amber-dev"
> Sent: Monday, 9 April 2018 01:58:03
> Subject: Expression switch - an alternate proposal

> What follows is a set of changes to the current expression switch
> proposal that I believe result in a better outcome.
>
> The goal is to tackle four specific things (in order):
> 1) The context as to whether it is a statement or expression switch
> (and thus what is or is not allowed) is too remote/subtle
> 2) Mixing arrows and colons is confusing to read
> 3) Blocks that do not have a separate scope
> 4) Fall through by default
> while still keeping the design as a unified switch language feature.
>
> To tackle #1 and #2, all cases in an expression switch must start with
> arrow -> (and all in statement switch must start with colon :)
> To tackle #3, all blocks in an expression switch must have braces
> To tackle #4, disallow fall through in expression switch (or add a
> fallthrough keyword)
>
> Here is the impact on some code:
>
> Current:
>
> var action = switch (light) {
>   case RED:
>     log("Red found");
>     break "Stop";
>   case YELLOW:
>   case GREEN -> "Go go go";
>   default:
>     log("WTF: " + light);
>     throw new WtfException("Unexpected color: " + light);
> }
>
> Alternate proposal:
>
> var action = switch (light) {
>   case RED -> {
>     log("Red found");
>     break "Stop";
>   }
>   case YELLOW, GREEN -> "Go go go";
>   default: -> {
>     log("WTF: " + light);
>     throw new WtfException("Unexpected color: " + light);
>   }
> }
>
> How is this still a unified switch? By observing that switch can be
> broken down into two distinct phases:
> - matching
> - action
> What makes it unified is that the matching phase is shared. Where
> statement and expression switch differ is in the action phase.
>
> The unified matching phase includes:
> - target expression to switch on
> - case null
> - constant case clauses
> - pattern matching case clauses
> - default clause
>
> The action phase of a statement switch is:
> - followed by a colon
> - have non-scoped blocks
> - fall through by default
> - can use return/continue/break
>
> The action phase of an expression switch is:
> - followed by an arrow
> - have an expression or a block (aka block-expression)
> - cannot fall through
> - cannot use return/continue/break
>
> By having a unified matching phase and a separate (but consistent)
> action phase in each form, I believe that the overall language feature
> would be much simpler to learn. And importantly, it achieves the goal
> of not deprecating or threatening the existence of the classic
> statement switch.
>
> All the key differences are in the action phase, which is clearly
> identified by arrow or colon (no remote context). Developers will come
> to associate the rule differences between the two forms with the arrow
> or colon, while the pattern matching knowledge is shared.
>
> Of course, the matching phase is not completely unified - expression
> switches must be exhaustive, and they may have auto default case
> clauses. (Perhaps the unified matching phase mental model suggests
> that auto default would be better written explicitly, eg. "default
> throw;", which could then apply to both statement and expression. Not
> sure.)
>
> I hope this alternate proposal is clear. To me, the split between a
> unified matching phase and a consistent but different action phase
> clearly identified in syntax results in much better readability,
> learning and understandability.
>
> Stephen
> PS. I think there are alternate block expression syntaxes, including
> ones that avoid "break expression", but I've chosen to avoid that
> bikeshed and use the closest one to the current proposal for the
> purpose of this mail

From forax at univ-mlv.fr Mon Apr 9 11:55:12 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 9 Apr 2018 13:55:12 +0200 (CEST)
Subject: Expression switch - an alternate proposal
In-Reply-To:
References:
Message-ID: <1415747104.263756.1523274912104.JavaMail.zimbra@u-pem.fr>

Do we need fallthrough in an expression switch? I believe, like Stephen, that we don't.
First, as Stephen points out in his example, if we have comma-separated cases, we need less fallthrough, and even if we have code like this
  var value = switch(x) {
    case 0:
      foo();
    case 1:
      bar();
      break 42;
  };
one can always rewrite it with comma-separated cases and an if
  var value = switch(x) {
    case 0, 1:
      if (x == 0) {
        foo();
      }
      bar();
      break 42;
  };

There is another reason not to allow fallthrough: we have ruled out all goto-related syntax like break label/continue label from the expression switch, yet we would still allow fallthrough, which is a goto to the next basic block.
I think we should be coherent here and not allow fallthrough in the expression switch.

Rémi

From brian.goetz at oracle.com Mon Apr 9 13:30:07 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 9 Apr 2018 09:30:07 -0400
Subject: Record and annotation values
In-Reply-To: <487962183.23171.1523254586643.JavaMail.zimbra@u-pem.fr>
References: <594033480.1352711.1523218614316.JavaMail.zimbra@u-pem.fr> <487962183.23171.1523254586643.JavaMail.zimbra@u-pem.fr>
Message-ID:

> I agree, i do not think it's a good idea to introduce record as annotation value just because we can but that's not the reason why i think we should introduce record as annotation value.
>
> There is a lot of time where you can construct an annotation with invalid values but those invalid annotation will not be catch at compile time but at runtime.
> By example,
> @Test({ignore=false, timeout_value=-3, timeout_unit=SECOND})
>
> So having a way to specify a contract for annotation values will make Java more safe.

This is why it is best to start with problems first, rather than solutions. It was far from obvious that this was your underlying motivation, and given this motivation, it's far from obvious this is the best way to get there.

So let's start over: the problem you're trying to solve is that there is not a good way currently to do compile- or run-time annotation validation?

> yes, a specfic attribute is perhaps a better,
> i wonder if the information in the RecordSignature should not be part of the Extractor.
>

I think the containment here is backwards. An extractor is a lower-level mechanism for implementing conditional deconstruction, which includes pattern matching. A class can have multiple extractors (patterns), just as it can have multiple constructors.

Records have a distinguished extractor (the primary deconstructor pattern), just as they have a distinguished constructor. It should be possible to reflectively navigate from a record class to its primary ctor/dtor. That might be by referring to them from the RecordSignature, or might be some other way (e.g., an invariant that you can use the record signature as an input to findConstructor / findDeconstructionPattern.)

From brian.goetz at oracle.com Mon Apr 9 13:38:15 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 9 Apr 2018 09:38:15 -0400
Subject: Switch on java.lang.Class
In-Reply-To:
References:
Message-ID: <3510133d-4147-fab9-366f-6ea42b523c4b@oracle.com>

I'm skeptical of this feature, because (a) it's not as widely applicable as it looks, (b) it's error-prone.

Both of these stem from the fact that comparing classes with == excludes subtypes.
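A minimal sketch of what "== excludes subtypes" means in practice, using only JDK classes. The class and method names (ClassEqualityDemo, isExactlyDate, isDateOrSubtype) are illustrative and do not come from the thread; java.sql.Timestamp is used because it is a real subclass of java.util.Date.

```java
import java.sql.Timestamp;
import java.util.Date;

public class ClassEqualityDemo {
    // Exact-class comparison: true only when the runtime class is Date itself.
    static boolean isExactlyDate(Object o) {
        return o.getClass() == Date.class;
    }

    // Subtype-aware test: true for Date and for any subclass of Date.
    static boolean isDateOrSubtype(Object o) {
        return o instanceof Date;
    }

    public static void main(String[] args) {
        Object plain = new Date(0L);
        Object stamp = new Timestamp(0L); // Timestamp extends java.util.Date

        System.out.println(isExactlyDate(plain));   // true
        System.out.println(isExactlyDate(stamp));   // false: the subtype is excluded
        System.out.println(isDateOrSubtype(stamp)); // true
    }
}
```

A switch whose labels are compared with == on Class objects would behave like isExactlyDate here, which is the surprise being described.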
So it really only works with final classes -- but if we had a feature like this, people might mistakenly use it with nonfinal classes, and be surprised when a subtype shows up (this can happen even when your IDE tells you there are no subtypes, because of dynamic proxies). And all of the examples you show are in low-level libraries, which is a warning sign.

Where did these snippets get their Class from? Good chance, case 1 got it from calling Object.getClass(). In which case, they can just pattern match on the type of the thing:

    switch (date) {
        case Date d: ...
        case Timestamp t: ...
        default: ...
    }

Case 2 is more likely just operating on types that it got from a reflection API. If you have only a few entries, an if-else will do; if you have more entries, a Map is likely to be the better choice. For situations like this, I'd rather invest in map literals or better Map.of() builders.

So, I would worry this feature is unlikely to carry its weight, and further, may lead to misuse.

On 4/9/2018 1:07 AM, Tagir Valeev wrote:
> Hello!
>
> I don't remember whether switch on java.lang.Class instance was
> discussed. I guess, this pattern is quite common and it will be useful
> to support it. Such code often appears in deserialization logic when
> we branch on desired type to deserialize. Here are a couple of
> examples from opensource libraries:
>
> 1. com.google.gson.DefaultDateTypeAdapter#read (gson-2.8.2):
>
>     Date date = deserializeToDate(in.nextString());
>     if (dateType == Date.class) {
>       return date;
>     } else if (dateType == Timestamp.class) {
>       return new Timestamp(date.getTime());
>     } else if (dateType == java.sql.Date.class) {
>       return new java.sql.Date(date.getTime());
>     } else {
>       // This must never happen: dateType is guarded in the primary constructor
>       throw new AssertionError();
>     }
>
> Could be rewritten as:
>
>     Date date = deserializeToDate(in.nextString());
>
>     return switch(dateType) {
>       case Date.class -> date;
>       case Timestamp.class -> new Timestamp(date.getTime());
>       case java.sql.Date.class -> new java.sql.Date(date.getTime());
>       default ->
>         // This must never happen: dateType is guarded in the primary constructor
>         throw new AssertionError();
>     };
>
> 2.
> com.fasterxml.jackson.databind.deser.std.FromStringDeserializer#findDeserializer
> (jackson-databind-2.9.4):
>
>     public static Std findDeserializer(Class rawType)
>     {
>         int kind = 0;
>         if (rawType == File.class) {
>             kind = Std.STD_FILE;
>         } else if (rawType == URL.class) {
>             kind = Std.STD_URL;
>         } else if (rawType == URI.class) {
>             kind = Std.STD_URI;
>         } else if (rawType == Class.class) {
>             kind = Std.STD_CLASS;
>         } else if (rawType == JavaType.class) {
>             kind = Std.STD_JAVA_TYPE;
>         } else if // more branches like this
>         } else {
>             return null;
>         }
>         return new Std(rawType, kind);
>     }
>
> Could be rewritten as:
>
>     public static Std findDeserializer(Class rawType)
>     {
>         int kind = switch(rawType) {
>         case File.class -> Std.STD_FILE;
>         case URL.class -> Std.STD_URL;
>         case URI.class -> Std.STD_URI;
>         case Class.class -> Std.STD_CLASS;
>         case JavaType.class -> Std.STD_JAVA_TYPE;
>         ...
>         default -> 0;
>         };
>         return kind == 0 ? null : new Std(rawType, kind);
>     }
>
> In such code all branches are mutually exclusive. The bootstrap method
> can generate a lookupswitch based on Class.hashCode, then equals
> checks, pretty similar to String switch implementation.
> Unlike String hash codes, Class.hashCode is not stable and varies between JVM
> launches, but they are already known during the bootstrap and we can
> trust them during the VM lifetime, so we can generate a lookupswitch.
> The minor problematic point is to support primitive classes like
> int.class. This cannot be passed directly as indy static argument, but
> this can be solved with condy.
>
> What do you think?
>
> With best regards,
> Tagir Valeev.
>

From brian.goetz at oracle.com Mon Apr 9 15:03:12 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 9 Apr 2018 11:03:12 -0400
Subject: Expression switch - an alternate proposal
In-Reply-To: <1418035131.260541.1523274555469.JavaMail.zimbra@u-pem.fr>
References: <1418035131.260541.1523274555469.JavaMail.zimbra@u-pem.fr>
Message-ID: <43593260-3529-94f9-a55a-f568c5fec5f7@oracle.com>

> I think i agree with you about the fact that the expression switch does need to support fallthrough,
> more on that in a folowing email.

I've been leaving this topic until we have ironed out the higher-order bits, but this seems a good enough time to start this discussion.

> I also agree with you that mixing arrows and colons is confusing.

I agree this is confusing, but I think it is also not likely to be something people do naturally -- because the -> form, where it is applicable, is so much more attractive -- so the risk of confusion is low. Just as style guides say to users "if you're going to use fall through, label it clearly", and most code does, style guides will guide users away from this confusion.

> Basically, your proposal is to use -> eveywhere, i think i prefer the opposite, do not use arrow at all.
> Using arrow in this context is disturbing because it doesn't mean the same things if it's the arrow of the lambda or the arrow inside an expression switch.

This is a reasonable alternative, but I don't think it would be very popular. I think people will really love being able to write:

    case MONDAY -> 1;
    case TUESDAY -> 2;

and will be sad if we make them write

    case MONDAY: break 1;
    case TUESDAY: break 2;

Not only will they be sad, but they will point out that the "obvious" answer was in front of our noses, and we did something different just to be different. (You can easily imagine the "There those Java guys go again, verbosity for its own sake" rants, but this time they might actually be right, rather than the folks who can't spell "migration compatibility" complaining about erasure.)

> the problem is that currently -> means create a new function scope and not creates a new code scope.

I think the scopes issue is a red herring.

> So if do not mixing arrows and colons is an important goal and i think it is, i think it's better to not use arrow.

Or just: avoid mixing arrows and colons.

> Moreover, do we really need a shorter syntax given that we can use break and a value ?

I suggest you do this poll at Devoxx. Make sure to wear flame-proof pants!

> and now we can discuss about adding a shorter syntax by making break optional if there is one expression.

... which we expect to be true almost all the time.

From brian.goetz at oracle.com Mon Apr 9 19:14:47 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 9 Apr 2018 15:14:47 -0400
Subject: Switch expressions -- gathering the threads
Message-ID: <403596bb-406b-6b99-1dd5-420f7bea5dfa@oracle.com>

There's been some active discussion on "Is this the switch expression construct we're looking for" over on amber-dev. It's a good time to take stock of where we are, and to identify any loose ends.

## Approach

Our approach is driven not merely by the desire to have an expression form of switch, but to make switch more generally useful as a multi-way conditional construct. The biggest driver here of course is making it work well with pattern matching.
Pattern matching is a driver for better handling of nulls and primitives (though these are also useful on their own); additionally, the more useful we make switch, the more obvious the cumbersomeness of its statement-orientation becomes. Pattern matching also pushes hard on the somewhat unfortunate scoping behavior; a straightforward interpretation of existing scoping of locals in switch would not be very good for pattern bindings.

At first, given all the constraints of existing switches, we thought it unlikely that we'd be able to get away with teaching switch some new tricks, and would have to create a new construct (say, "match"). Bit by bit, though, we were able to chip away at the accidental complexity of the { constants, patterns } x { statement, expression } space, to the point where it seemed practical to unify the construct.

Having a single construct has pros and cons. On the one hand, entities should not be multiplied without necessity; on the other, a one-size-fits-all construct might exhibit schizoid behavior. And the switch statement probably has more unusual (some would say objectionable) behaviors than any other Java construct, putting us in tension between compatibility and perceived complexity.

## Current proposal

The current proposal starts with existing statement switch, extending `break` to support a value, and requiring that the value-ness of the break match the value-ness of the switch (just as return must with methods or lambdas). We also slightly adjust the rules regarding nonlocal control flow _through_ a switch. Because expression switches are expressions, they must be total. For expression switches over enums and sealed types, we have the option to infer a throwing default when all sealed members are provided.

We then offer a shorthand form for case labels in expression switches, that:

    case P -> e;

is shorthand for
    case P: break e;

This leaves the following differences between expression switches and statement switches:

 - Expression switches are required to be exhaustive; statement switches cannot be required to be exhaustive.
 - Expression switches permit the `->` shorthand form.
 - Expression switches may restrict fallthrough in some way, or may not, TBD.
 - You can `return` and `continue` out of a statement switch, but not out of an expression switch (like lambdas.)
 - You cannot `break` or `continue` _through_ an expression switch (like lambdas and conditionals.)

And leaves some open issues for discussion:

 - We have some options as to whether to restrict fallthrough in expression switches, and also whether to restrict fallthrough into patterns.
 - We have the option to try and give the `->` form some meaning in statement switches.

## Commentary

The concerns raised so far mostly revolve around potential confusion. Because the two forms are mostly alike, but have subtle differences, the fear is this will lead to confusion. Various schemes have been suggested to make them look more different, or to make them behave more different, to make it more clear where the lines are. For example, the following have been cited:

 - Saying `break expression` is ugly, or confusable for a labeled break;
 - Concerns that fallthrough-by-default is an even worse default for expression switches than for statements (and, if we restrict fallthrough in switch expression, the gap between the forms grows);
 - The asymmetry of the implicit throwing default in apparently-exhaustive enum switches will be a sharp edge;
 - That a user might not be able to tell, by looking at the middle of a large switch, whether it's an expression or statement switch;
 - The possibility people will write code with mixed label forms (colon and arrow) seems to scare the heck out of people;
 - The arrows might confuse people with similarity to lambdas.

My reaction to most of these is "meh".
I think the arrow-form is going to be so preferable that the risk of fallthrough will be low (because there are few statements in the first place), and can be lowered further with restrictions; similarly, I think unrestricted mixing of arrow and colon forms will be quite rare (except for the case where there is one catch-all case, often a default, which will take statement form, which seems mostly harmless), and strongly discouraged. And that means that the confusion between expression and statement will be nonexistent -- because the expression ones will have arrows and the statement ones will not.

There are also a number of calls for "If X is rare, just disallow X", where X could be a statement-plus-expression form in expression switches or mixed label forms in one switch. The problem is that they are usually not rare _enough_ that their lack would not cause a different kind of backlash.

#### Some alternatives that have been suggested

**Separate keyword.** Having a separate keyword ("choose") for expression switch seems like it should dispel all the "but people will be confused" issues, but I'm not sure it actually will. Because the two constructs will still be so similar, the differences will likely still be surprises to people. It is also not a magic wand; we still have to figure out how to deal with statement+expression compounds, and it doesn't automatically rule out the "mixed colons and arrows" problem.

**Block expression**. For the "mixed colons and arrows" problem, several have suggested some sort of ad-hoc, switch-specific block expression, but from a language evolution perspective, I think this cure is worse than the disease. Having an ad-hoc form just for switch is terrible, and adding a general block expression form to the language is not where we want to go -- and doing it to avoid the perception of rampant mixed colons-and-arrows would be killing a dust mite with a napalm blast.

**No colons in expression switch.**
Without a block expression, this is a non-starter; there are way too many legitimate uses for compound expressions in expression switches.

**No mixed colons and arrows**. This will be intensely irritating to users; if you add one compound expression in a 50-way switch, you have to change 49 others from the nice form to the nasty one.

## Open issues

The main issue we need to address is whether we want to restrict fallthrough in expression switches (or in the extreme case, prohibit it entirely.)

One argument why fallthrough might be desirable is that some existing statement switches that make use of fallthrough (such as string or packet parsers) could become expression switches; these frequently have a "main result" they want to return (such as the index of the next character), while at the same time recording some side state about the context. Refactoring these to expression switches could be beneficial just as it is for many other statement switches. On the other hand, it would also be reasonable to say we should leave these cases in statement-world where they are now.

A form of fallthrough that I think may be more common in expression switches is when something wants to fall _into_ the default:

    int x = switch (y) {
        case "Foo" -> 1;
        case "Bar" -> 2;
        case null:
        default:
            // handle exceptional case here
    }

Because `default` is not a pattern, we can't say:

    case null, default:

here. (Well, we could make it one.) Though we could carve out an exception for such "trivial" fallthrough.

I think a reasonable restriction that might preserve flexibility while avoiding most accidental uses is to make it illegal to fall _into_ an arrow-labeled case; if you want fallthrough, stay in colon-world. (It's impossible to fall _out of_ an arrow case.)
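A rough sketch of the two label forms side by side, written for a Java 14 or later compiler, where the value-bearing `break e` discussed in this thread is spelled `yield e`. The class and method names (SwitchLabelFormsDemo, colonForm, arrowForm) are hypothetical. In the colon form, control falls from "Foo" into "Bar"; in the arrow form, each case stands alone and there is nothing to fall out of.

```java
public class SwitchLabelFormsDemo {
    // Colon form: "Foo" has no yield, so control falls into "Bar".
    static int colonForm(String y, StringBuilder log) {
        return switch (y) {
            case "Foo":
                log.append("saw Foo;");
                // no yield here, so control falls into the next label
            case "Bar":
                log.append("handling Bar;");
                yield 2;
            default:
                yield -1;
        };
    }

    // Arrow form: every case produces its own value; no fallthrough is possible.
    static int arrowForm(String y) {
        return switch (y) {
            case "Foo" -> 1;
            case "Bar" -> 2;
            default -> -1;
        };
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        System.out.println(colonForm("Foo", log)); // 2
        System.out.println(log);                   // saw Foo;handling Bar;
        System.out.println(arrowForm("Foo"));      // 1
    }
}
```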
Given that most users would rather live in arrow-world, this means that for practical purposes, there's no fallthrough in expression switches at all, but advanced users have a fallback that works just like the switch and fallthrough they've always known.

While it is not specific to expression vs statement switch, we should also ask whether we want to restrict fallthrough into certain kinds of pattern labels (i.e., those without binding variables), even in statement switch. (I don't really see the point, though; I don't see a path to getting rid of the breaks, which would be the real payoff.) Further, because of the intersection rules about OR patterns, it's more likely an accidental fallthrough from one pattern label to another would result in a compile error anyway.

#### -> in statement switch

Finally, people have asked about whether we should consider allowing `->` for statement switches too (perhaps on the theory that they're kind of like void-valued expression switches.) I see the attraction here -- when the majority of actions are single-line, this would be a winner, and you could drop the breaks. However, because the distribution of statement count in switch arms is all over the map, this would dramatically increase the prevalence of mixed colon-and-arrow switches, and probably further expose people to the risk of accidental fallthrough, as now break is needed sometimes and not others _in the same statement switch_.

From amaembo at gmail.com Tue Apr 10 05:12:59 2018
From: amaembo at gmail.com (Tagir Valeev)
Date: Tue, 10 Apr 2018 12:12:59 +0700
Subject: Switch on java.lang.Class
In-Reply-To: <3510133d-4147-fab9-366f-6ea42b523c4b@oracle.com>
References: <3510133d-4147-fab9-366f-6ea42b523c4b@oracle.com>
Message-ID:

Hello!

Does not sound convincing. First, in my experience, it's quite widely applicable. My first two samples were from what you called low-level libraries just because I wanted to grep something well-known.
Now I grepped the jdk10 source for `\w+ == \w+\.class`, scanned manually about 10% of the results, and found about 10 places where it's useful (some examples are shown below). So, extrapolating, I may assume that this construct can be applied roughly 100 times in the JDK (note that my regexp does not cover xyz.equals(foo.class), and some developers prefer this style; different spacing is not covered either). You may surely call the JDK code "low-level libraries", but grepping the IntelliJ IDEA source code I also see a significant number of occurrences. Though I don't see why usefulness of the feature in low-level libraries should be a warning sign.

In any case I'm pretty sure that switch on class will be more applicable than the switch on floats. But you are doing the switch on floats. Why? For consistency, of course. You want to support all literals in switch. But class literals are also literals, according to JLS 15.8.2, so it is inconsistent not to support them (especially taking into account that their usefulness is not the lowest of all possible literals). Another comparison: all literals (including class literals) and enum values are acceptable as annotation values. The same holds in switch expressions, but excluding the class literals, which is inconsistent.

I don't buy the error-prone argument either. Is `switch(doubleValue) {case Math.PI: ...}` error-prone? Why can't somebody assume that the comparison should tolerate some delta difference between doubleValue and Math.PI? Somebody surely can, but that's silly. We know that the switch checks for equality; it was always so. It will be so for classes as well, and assuming something different is inconsistent. After all, writing foo.equals(Bar.class) or foo == Bar.class is allowed in the language, people use these constructions, and often it's the right thing to do. Of course their code becomes erroneous sometimes, because in this particular place the inheritance should be taken into account.
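The floating-point analogy drawn above can be made concrete with a small sketch. The class and method names (DoubleEqualityDemo, exactlyEqual, nearlyEqual) and the tolerance value are assumptions chosen for illustration only: exact equality is what == and an equality-based switch label would test, while a tolerance check is a different operation the programmer has to ask for explicitly.

```java
public class DoubleEqualityDemo {
    // Exact comparison: the semantics an equality-based switch label would have.
    static boolean exactlyEqual(double a, double b) {
        return a == b;
    }

    // Tolerance-based comparison: a separate, explicit choice by the programmer.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2; // not exactly 0.3 in binary floating point

        System.out.println(exactlyEqual(Math.PI, Math.PI)); // true
        System.out.println(exactlyEqual(sum, 0.3));         // false
        System.out.println(nearlyEqual(sum, 0.3, 1e-9));    // true
    }
}
```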
But the same is true for the doubleValue == Math.PI comparison: sometimes it's OK, sometimes it's wrong and some tolerance interval should be checked instead. And when it's OK, you add a new option to use switch on doubles.

Several code samples found in JDK:

1. javafx.base/javafx/util/converter/LocalDateTimeStringConverter.java:197 (final classes)

    if (type == LocalDate.class) {
        return (T)LocalDate.from(chronology.date(temporal));
    } else if (type == LocalTime.class) {
        return (T)LocalTime.from(temporal);
    } else {
        return (T)LocalDateTime.from(chronology.localDateTime(temporal));
    }

2. java.desktop/sun/print/Win32PrintService.java:928 (final classes)

    if (category == ColorSupported.class) {
        int caps = getPrinterCapabilities();
        if ((caps & DEVCAP_COLOR) != 0) {
            return (T)ColorSupported.SUPPORTED;
        } else {
            return (T)ColorSupported.NOT_SUPPORTED;
        }
    } else if (category == PrinterName.class) {
        return (T)getPrinterName();
    } else if (category == PrinterState.class) {
        return (T)getPrinterState();
    } else if (category == PrinterStateReasons.class) {
        return (T)getPrinterStateReasons();
    } else if (category == QueuedJobCount.class) {
        return (T)getQueuedJobCount();
    } else if (category == PrinterIsAcceptingJobs.class) {
        return (T)getPrinterIsAcceptingJobs();
    } else {
        return null;
    }

3. com.sun.media.sound.SoftSynthesizer#getPropertyInfo (line 926) - final classes; several blocks like this

    if (c == Byte.class)
        item2.value = Byte.valueOf(s);
    else if (c == Short.class)
        item2.value = Short.valueOf(s);
    else if (c == Integer.class)
        item2.value = Integer.valueOf(s);
    else if (c == Long.class)
        item2.value = Long.valueOf(s);
    else if (c == Float.class)
        item2.value = Float.valueOf(s);
    else if (c == Double.class)
        item2.value = Double.valueOf(s);

4. java.awt.Component#getListeners (interfaces!), Window#getListeners, List#getListeners, JComponent#getListeners, etc.
are similar public T[] getListeners(Class listenerType) { EventListener l = null; if (listenerType == ComponentListener.class) { l = componentListener; } else if (listenerType == FocusListener.class) { l = focusListener; } else if (listenerType == HierarchyListener.class) { l = hierarchyListener; } else if (listenerType == HierarchyBoundsListener.class) { l = hierarchyBoundsListener; } else if (listenerType == KeyListener.class) { l = keyListener; } else if (listenerType == MouseListener.class) { l = mouseListener; } else if (listenerType == MouseMotionListener.class) { l = mouseMotionListener; } else if (listenerType == MouseWheelListener.class) { l = mouseWheelListener; } else if (listenerType == InputMethodListener.class) { l = inputMethodListener; } else if (listenerType == PropertyChangeListener.class) { return (T[])getPropertyChangeListeners(); } return AWTEventMulticaster.getListeners(l, listenerType); } 5. java.beans.XMLEncoder#primitiveTypeFor (final classes) if (wrapper == Boolean.class) return Boolean.TYPE; if (wrapper == Byte.class) return Byte.TYPE; if (wrapper == Character.class) return Character.TYPE; if (wrapper == Short.class) return Short.TYPE; if (wrapper == Integer.class) return Integer.TYPE; if (wrapper == Long.class) return Long.TYPE; if (wrapper == Float.class) return Float.TYPE; if (wrapper == Double.class) return Double.TYPE; if (wrapper == Void.class) return Void.TYPE; return null; 6. javax.swing.plaf.synth.SynthTableUI.SynthTableCellRenderer#configureValue (mix of abstract, non-final and final classes) private void configureValue(Object value, Class columnClass) { if (columnClass == Object.class || columnClass == null) { // case Object.class, null! setHorizontalAlignment(JLabel.LEADING); } else if (columnClass == Float.class || columnClass == Double.class) { if (numberFormat == null) { numberFormat = NumberFormat.getInstance(); } setHorizontalAlignment(JLabel.TRAILING); setText((value == null) ? 
"" : ((NumberFormat)numberFormat).format(value)); } else if (columnClass == Number.class) { setHorizontalAlignment(JLabel.TRAILING); // Super will have set value. } else if (columnClass == Icon.class || columnClass == ImageIcon.class) { setHorizontalAlignment(JLabel.CENTER); setIcon((value instanceof Icon) ? (Icon)value : null); setText(""); } else if (columnClass == Date.class) { if (dateFormat == null) { dateFormat = DateFormat.getDateInstance(); } setHorizontalAlignment(JLabel.LEADING); setText((value == null) ? "" : ((Format)dateFormat).format(value)); } else { configureValue(value, columnClass.getSuperclass()); // note this: recursively going to superclass automatically } } With best regards, Tagir Valeev. On Mon, Apr 9, 2018 at 8:38 PM, Brian Goetz wrote: > I'm skeptical of this feature, because (a) its not as widely applicable as > it looks, (b) its error-prone. > > Both of these stem from the fact that comparing classes with == excludes > subtypes. So it really only works with final classes -- but if we had a > feature like this, people might mistakenly use it with nonfinal classes, > and be surprised when a subtype shows up (this can happen even when your > IDE tells you there are no subtypes, because of dynamic proxies). And all > of the examples you show are in low-level libraries, which is a warning > sign. > > Where did these snippets get their Class from? Good chance, case 1 got it > from calling Object.getClass(). In which case, they can just pattern match > on the type of the thing: > > switch (date) { > case Date d: ... > case Timestamp t: ... > default: ... > } > > Case 2 is more likely just operating on types that it got from a > reflection API. If you have only a few entries, an if-else will do; if you > have more entries, a Map is likely to be the better choice. For situations > like this, I'd rather invest in map literals or better Map.of() builders. 
> > So, I would worry this feature is unlikely to carry its weight, and > further, may lead to misuse. > > > > On 4/9/2018 1:07 AM, Tagir Valeev wrote: > >> Hello! >> >> I don't remember whether switch on java.lang.Class instance was >> discussed. I guess, this pattern is quite common and it will be useful to >> support it. Such code often appears in deserialization logic when we branch >> on desired type to deserialize. Here are a couple of examples from >> opensource libraries: >> >> 1. com.google.gson.DefaultDateTypeAdapter#read (gson-2.8.2): >> >> Date date = deserializeToDate(in.nextString()); >> if (dateType == Date.class) { >> return date; >> } else if (dateType == Timestamp.class) { >> return new Timestamp(date.getTime()); >> } else if (dateType == java.sql.Date.class) { >> return new java.sql.Date(date.getTime()); >> } else { >> // This must never happen: dateType is guarded in the primary >> constructor >> throw new AssertionError(); >> } >> >> Could be rewritten as: >> >> Date date = deserializeToDate(in.nextString()); >> return switch(dateType) { >> case Date.class -> date; >> case Timestamp.class -> new Timestamp(date.getTime()); >> case java.sql.Date.class -> new java.sql.Date(date.getTime()); >> default -> >> // This must never happen: dateType is guarded in the primary >> constructor >> throw new AssertionError(); >> }; >> >> 2. 
com.fasterxml.jackson.databind.deser.std.FromStringDeserializer#findDeserializer
>> (jackson-databind-2.9.4):
>>
>>     public static Std findDeserializer(Class rawType)
>>     {
>>         int kind = 0;
>>         if (rawType == File.class) {
>>             kind = Std.STD_FILE;
>>         } else if (rawType == URL.class) {
>>             kind = Std.STD_URL;
>>         } else if (rawType == URI.class) {
>>             kind = Std.STD_URI;
>>         } else if (rawType == Class.class) {
>>             kind = Std.STD_CLASS;
>>         } else if (rawType == JavaType.class) {
>>             kind = Std.STD_JAVA_TYPE;
>>         } else if // more branches like this
>>         } else {
>>             return null;
>>         }
>>         return new Std(rawType, kind);
>>     }
>>
>> Could be rewritten as:
>>
>>     public static Std findDeserializer(Class rawType)
>>     {
>>         int kind = switch(rawType) {
>>             case File.class -> Std.STD_FILE;
>>             case URL.class -> Std.STD_URL;
>>             case URI.class -> Std.STD_URI;
>>             case Class.class -> Std.STD_CLASS;
>>             case JavaType.class -> Std.STD_JAVA_TYPE;
>>             ...
>>             default -> 0;
>>         };
>>         return kind == 0 ? null : new Std(rawType, kind);
>>     }
>>
>> In such code all branches are mutually exclusive. The bootstrap method
>> can generate a lookupswitch based on Class.hashCode, then equals checks,
>> pretty similar to String switch implementation. Unlike String hash codes
>> Class.hashCode is not stable and varies between JVM launches, but they are
>> already known during the bootstrap and we can trust them during the VM
>> lifetime, so we can generate a lookupswitch. The minor problematic point is
>> to support primitive classes like int.class. This cannot be passed directly
>> as indy static argument, but this can be solved with condy.
>>
>> What do you think?
>>
>> With best regards,
>> Tagir Valeev.
>>
>>
>
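The exact-class dispatch discussed in this message can already be approximated with a lookup table keyed by `Class`, which is roughly the hash-then-equals lookup a generated lookupswitch would perform. What follows is a minimal sketch under stated assumptions, not code from Jackson or Gson: the class `ClassDispatch`, the `KIND_*` codes, and the method `kindOf` are hypothetical names chosen for illustration. It also shows the exact-match behavior at issue in the thread: a subclass of a registered class is not found.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

public class ClassDispatch {
    // Hypothetical kind codes; 0 is reserved for "no match".
    static final int KIND_FILE = 1;
    static final int KIND_URL = 2;
    static final int KIND_URI = 3;
    static final int KIND_NUMBER = 4;

    // Lookup table keyed by the exact Class object.
    private static final Map<Class<?>, Integer> KINDS = new HashMap<>();
    static {
        KINDS.put(File.class, KIND_FILE);
        KINDS.put(URL.class, KIND_URL);
        KINDS.put(URI.class, KIND_URI);
        KINDS.put(Number.class, KIND_NUMBER);
    }

    // Returns the kind registered for exactly this class, or 0 if none.
    // Subclasses of a registered class do NOT match, just as with ==.
    static int kindOf(Class<?> rawType) {
        Integer kind = KINDS.get(rawType);
        return kind == null ? 0 : kind;
    }

    public static void main(String[] args) {
        System.out.println(kindOf(URL.class));     // prints 2
        System.out.println(kindOf(Number.class));  // prints 4
        System.out.println(kindOf(Integer.class)); // prints 0: Integer is a subclass of Number, not Number itself
    }
}
```

A `HashMap` keyed by `Class` relies on identity-based `hashCode` and `equals`, so, like the `==` chains above, it is only appropriate where subtypes should not match.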
From forax at univ-mlv.fr Tue Apr 10 08:02:24 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 10 Apr 2018 10:02:24 +0200 (CEST)
Subject: Expression switch - an alternate proposal
In-Reply-To: <43593260-3529-94f9-a55a-f568c5fec5f7@oracle.com>
References: <1418035131.260541.1523274555469.JavaMail.zimbra@u-pem.fr> <43593260-3529-94f9-a55a-f568c5fec5f7@oracle.com>
Message-ID: <113201964.596558.1523347344352.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax", "Stephen Colebourne"
> Cc: "amber-spec-experts"
> Sent: Monday, 9 April 2018 17:03:12
> Subject: Re: Expression switch - an alternate proposal

>> I think I agree with you about the fact that the expression switch does need to
>> support fallthrough,
>> more on that in a following email.
>
> I've been leaving this topic until we have ironed out the higher-order
> bits, but this seems a good enough time to start this discussion.
>
>> I also agree with you that mixing arrows and colons is confusing.
> I agree this is confusing, but I think it is also not likely to be
> something people do naturally -- because the -> form, where it is
> applicable, is so much more attractive -- so the risk of confusion is
> low. Just as style guides say to users "if you're going to use fall
> through, label it clearly", and most code does, style guides will guide
> users away from this confusion.
>
>> Basically, your proposal is to use -> everywhere, I think I prefer the opposite,
>> do not use arrow at all.
>> Using arrow in this context is disturbing because it doesn't mean the same
>> thing if it's the arrow of the lambda or the arrow inside an expression
>> switch.
>
> This is a reasonable alternative, but I don't think it would be very
> popular. I think people will really love being able to write:
>
>     case MONDAY -> 1;
>     case TUESDAY -> 2;
>
> and will be sad if we make them write
>
>     case MONDAY: break 1;
>     case TUESDAY: break 2;
>
> Not only will they be sad, but they will point out that the "obvious"
> answer was in front of our noses, and we did something different just to
> be different. (You can easily imagine the "There those Java guys go
> again, verbosity for its own sake" rants, but this time they might
> actually be right, rather than the folks who can't spell "migration
> compatibility" complaining about erasure.)

Apart from the semantic difference between -> inside a lambda and -> inside a case, the fact that you can use -> but not -> { } makes me think that if we need a shorter syntax, one that uses -> is not the best one.

>
>> the problem is that currently -> means create a new function scope and not
>> create a new code scope.
>
> I think the scopes issue is a red herring.
>
>> So if not mixing arrows and colons is an important goal, and I think it is, I
>> think it's better to not use arrow.
>
> Or just: avoid mixing arrows and colons.

That may be hard. If you take the ASM code as an example, we have two kinds of switches: low-level ones that parse method descriptors, generic signatures, etc., which will continue to use the statement switch, and "association" switches, which associate a value to another value, when ASM transforms the high-level Visitor API to low-level bytecodes or when ASM does abstract analysis like computing the stack frames. Those can be transformed to expression switches, but if you take a look at these switches, usually they do computations/allocations, so written as an expression switch, there will be cases with one single expression (most of them) but also one or two cases per switch that will assign a local variable, so with the currently proposed syntax, it means mixing arrows and colons.

I'm sure there are other shorter syntaxes possible that do not use ->; technically we don't even need the symbol ->, so why not just use ':' as a shorter syntax.
You may think that it means that the grammar has to be smarter to distinguish between a single expression and a statement that may be followed by other statements, but you can parse everything as statements and, in a later phase, if there is only one expression, consider it as a break expression.

The main drawback I see in not using '->' in the grammar is that you cannot allow fallthrough, but I think we should disable fallthrough in an expression switch anyway.

So in terms of design, I see it the opposite way: the fact that we do not allow fallthrough gives us more degrees of freedom in terms of syntax, so let us use a more regular syntax by avoiding introducing '->'.

I think not introducing -> also has the nice effect of making the expression switch less alien compared to the statement switch, because it removes one of the syntactic differences between them.

>
>> Moreover, do we really need a shorter syntax given that we can use break and a
>> value ?
>
> I suggest you do this poll at Devoxx. Make sure to wear flame-proof pants!

I have a 3-hour session with José Paumard at Devoxx France (only 3000 attendees, so a little smaller than the real Devoxx in Belgium) next week on amber and valhalla.
So I will run the poll, we will see.
For the pants, I've a plan :)

>
>> and now we can discuss about adding a shorter syntax by making break optional if
>> there is one expression.
>
> ... which we expect to be true almost all the time.

Rémi

From brian.goetz at oracle.com Tue Apr 10 12:25:48 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 10 Apr 2018 08:25:48 -0400
Subject: Switch on java.lang.Class
In-Reply-To:
References: <3510133d-4147-fab9-366f-6ea42b523c4b@oracle.com>
Message-ID: <0d10a4b9-8153-3c1e-9981-97c38a9dd16a@oracle.com>

> Though I don't see why usefulness of the feature in a low-level
> libraries should be the warning sign.
I don't dispute its usefulness, but I'm sure you agree that "usefulness > 0" is not the bar for putting a feature in the language -- it is way, way higher than that.

My bigger concern is that it is error-prone -- especially if we release type-test patterns and Class constant patterns at the same time. Many users will be tempted to use Class patterns as a less ugly alternative to instanceof tests -- and then their code will be subtly wrong. When given the choice of what looks like "old fashioned switch with a new type", and "new-fangled type-test patterns", many users will lean towards the former because it's familiar. And get the wrong thing.

So the problem is not that it's only useful to low-level users; it's that other users may be tempted to use it, and get the wrong thing. (This isn't theoretical. "Type Switch" has been an RFE for years; in the examples presented as justification for the feature, many wrongly conflate it with instanceof.)

> In any case I'm pretty sure that switch on class will be more
> applicable, than the switch on floats. But you are doing the switch on
> floats. Why? For consistency, of course. You want to support all
> literals in switch. But class literals are also literals, according to
> JLS 15.8.2, so it is inconsistent not to support them (especially
> taking into account that their usefulness is not the lowest of all
> possible literals).

"For consistency" arguments are always weak justifications for including a feature, because you can always find a precedent or rule to be consistent with. The justification is not merely for consistency; it is to avoid introducing _new_ asymmetries. It would be silly to not allow float literals as patterns; then you couldn't match against `Complex(0.0f, 0.0f)`.

So the choice is not about float in _switch_, but about float as a _pattern_. Once you admit the latter, it is hard to say no to the former.
(Same with null; we're not doing `case null` for its own sake, it's to support `null` as a pattern; using it in `case` is a consequence of that.)

So, should class literals be a pattern? That would also mean that you could say

    if (x instanceof Foo.class) { ... }

and it would mean something subtly different than

    if (x instanceof Foo) { ... }

That's not so good. Or the same mistake in switch:

    switch (anObject) {
        case "Foo":
        case String.class:
        ...
    }

which means "does anObject equal the constant String.class". The confusion between `String` and `String.class` as patterns is a pretty serious risk. And again, introducing both kinds of patterns at once makes it worse.

So, while I think it's a consistent and useful feature, I also don't think it's necessarily a good idea to expose everyone to this confusion.

From daniel.smith at oracle.com Tue Apr 10 19:34:13 2018
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 10 Apr 2018 13:34:13 -0600
Subject: Switch expressions -- gathering the threads
In-Reply-To: <403596bb-406b-6b99-1dd5-420f7bea5dfa@oracle.com>
References: <403596bb-406b-6b99-1dd5-420f7bea5dfa@oracle.com>
Message-ID: <55F4951B-277D-4AD9-A96E-DE36406C6ACB@oracle.com>

> On Apr 9, 2018, at 1:14 PM, Brian Goetz wrote:
>
> A form of fallthrough that I think may be more common in expression switches is when something wants to fall _into_ the default:
>
>     int x = switch (y) {
>         case "Foo" -> 1;
>         case "Bar" -> 2;
>
>         case null:
>         default:
>             // handle exceptional case here
>     }
>
> Because `default` is not a pattern, we can't say:
>
>     case null, default:
>
> here. (Well, we could make it one.) Though we could carve out an exception for such "trivial" fallthrough.

As a matter of terminology, I think it would be helpful for us to not call this fallthrough at all.
It creates a lot of confusion when somebody is making an assertion about fallthrough, and it's unclear whether this kind of thing is being included or not. JLS is a good guide: grammatically, the body of a switch statement is a sequence of SwitchBlocks, each of which has a sequence of SwitchLabels followed by some BlockStatements. https://docs.oracle.com/javase/specs/jls/se10/html/jls-14.html#jls-14.11 JLS doesn't formally define the concept of "fallthrough" but I suggest we use it to describe the situation in which control flows from one SwitchBlock to another. What you've illustrated is instead a "switch case with multiple labels"?something deserving scrutiny on its own, but really a different sort of problem than fallthrough. ?Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Tue Apr 10 20:30:53 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 10 Apr 2018 22:30:53 +0200 (CEST) Subject: Switch expressions -- gathering the threads In-Reply-To: <55F4951B-277D-4AD9-A96E-DE36406C6ACB@oracle.com> References: <403596bb-406b-6b99-1dd5-420f7bea5dfa@oracle.com> <55F4951B-277D-4AD9-A96E-DE36406C6ACB@oracle.com> Message-ID: <1288016983.922021.1523392253596.JavaMail.zimbra@u-pem.fr> > De: "daniel smith" > ?: "Brian Goetz" > Cc: "amber-spec-experts" > Envoy?: Mardi 10 Avril 2018 21:34:13 > Objet: Re: Switch expressions -- gathering the threads >> On Apr 9, 2018, at 1:14 PM, Brian Goetz < [ mailto:brian.goetz at oracle.com | >> brian.goetz at oracle.com ] > wrote: >> A form of fallthrough that I think may be more common in expression switches is >> when something wants to fall _into_ the default: >> int x = switch (y) { >> case "Foo" -> 1; >> case "Bar" -> 2; >> case null: >> default: >> // handle exceptional case here >> } >> Because `default` is not a pattern, we can't say: >> case null, default: >> here. (Well, we could make it one.) Though we could carve out an exception for >> such "trivial" fallthrough. 
> As a matter of terminology, I think it would be helpful for us to not call this > fallthrough at all. It creates a lot of confusion when somebody is making an > assertion about fallthrough, and it's unclear whether this kind of thing is > being included or not. > JLS is a good guide: grammatically, the body of a switch statement is a sequence > of SwitchBlocks, each of which has a sequence of SwitchLabels followed by some > BlockStatements. > [ https://docs.oracle.com/javase/specs/jls/se10/html/jls-14.html#jls-14.11 | > https://docs.oracle.com/javase/specs/jls/se10/html/jls-14.html#jls-14.11 ] > JLS doesn't formally define the concept of "fallthrough" but I suggest we use it > to describe the situation in which control flows from one SwitchBlock to > another. > What you've illustrated is instead a "switch case with multiple > labels"?something deserving scrutiny on its own, but really a different sort of > problem than fallthrough. > ?Dan I'm not sure this difference is important. What about the example below, multiple labels or a fallthrough ? switch(x) { case 0: ; case 1: } regards, R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Apr 10 20:38:54 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 10 Apr 2018 16:38:54 -0400 Subject: Annos on records (was: Records -- Using them as JPA entities and validating them with Bean Validation) In-Reply-To: <606812520.898989.1523381675460.JavaMail.zimbra@u-pem.fr> References: <606812520.898989.1523381675460.JavaMail.zimbra@u-pem.fr> Message-ID: [ moving to amber-spec-experts] I tend to agree.? It will take longer to adopt, but it _is_ a new kind of target in a source file, and then frameworks can decide what it should mean, and then there's no confusion. 
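To make the annotation-target discussion concrete, here is a minimal sketch (not from the thread) of how a framework might inspect the targets an annotation type declares and decide for itself what a record-level target should mean. The annotation `Id`, the class `TargetNames`, and the target name `"RECORD"` are assumptions for illustration only; the check compares target names as strings so that the sketch compiles on JDKs whose `ElementType` has no such constant.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Set;
import java.util.TreeSet;

public class TargetNames {
    // Hypothetical framework annotation, declared with existing targets only.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER})
    @interface Id { }

    // Collects the names of the targets an annotation type declares.
    // @Target itself has runtime retention, so it is visible reflectively.
    static Set<String> targetNames(Class<? extends Annotation> annotationType) {
        Set<String> names = new TreeSet<>();
        Target target = annotationType.getAnnotation(Target.class);
        if (target != null) {
            for (ElementType type : target.value()) {
                names.add(type.name());
            }
        }
        return names;
    }

    // True if the annotation type has opted in to the named target.
    static boolean declaresTarget(Class<? extends Annotation> annotationType, String name) {
        return targetNames(annotationType).contains(name);
    }

    public static void main(String[] args) {
        System.out.println(targetNames(Id.class));              // prints [FIELD, METHOD, PARAMETER]
        System.out.println(declaresTarget(Id.class, "RECORD")); // prints false
    }
}
```

Under this sketch, an annotation that has not opted in to the new target is simply left alone by the framework, which is the "frameworks can decide what it should mean" position rather than automatic propagation.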
It's possible, too, as a migration move, to split the difference, though I'm not sure its worth it -- add a new target, _and_, if the target includes param/field/method, but does _not_ include record, then lower the anno onto all applicable members. On 4/10/2018 1:34 PM, Remi Forax wrote: > No, not right for me, > a new Annotation target is better so each framework can decide what it means for its annotation. > > It will slow the adoption but it's better in the long term. > > R?mi > > ----- Mail original ----- >> De: "Kevin Bourrillion" >> ?: "Gunnar Morling" >> Cc: "amber-dev" >> Envoy?: Mardi 10 Avril 2018 19:25:57 >> Objet: Re: Records -- Using them as JPA entities and validating them with Bean Validation >> On Mon, Apr 9, 2018 at 1:39 PM, Gunnar Morling wrote: >> >>> * Annotation semantics: I couldn't find any example of records with >>> annotations, but IIUC, something like >>> >>> @Entity record Book(@Id long id, String isbn) { ... } >>> >>> would desugar into >>> >>> class @Entity public class Book { private @Id long id, private >>> String isbn; ... }; >>> >>> For the JPA entity use case it'd be helpful to have an option to lift >>> annotations to the corresponding getters instead of the fields (as the >>> location of the @Id annotation controls the default strategy -- field vs. >>> property -- for reading/writing entity state). Similarly, Bean Validation >>> would benefit from such option. >>> >> My assumption has been that we would allow an annotation on a record >> parameter as long as it has *any of *{FIELD,METHOD,PARAMETER} as target, >> and that the annotation would be automatically propagated to each >> synthesized element it applies to. Does this sound about right to everyone? >> >> >> -- >> Kevin Bourrillion | Java Librarian | Google, Inc. 
| kevinb at google.com From kevinb at google.com Tue Apr 10 20:42:24 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 10 Apr 2018 13:42:24 -0700 Subject: Annos on records (was: Records -- Using them as JPA entities and validating them with Bean Validation) In-Reply-To: References: <606812520.898989.1523381675460.JavaMail.zimbra@u-pem.fr> Message-ID: If we create a new ElementType.RECORD, the annotation in question won't even be *able *to add that target type until it is ready to *require* JDK 13 (or whatever) as its new minimum version. On Tue, Apr 10, 2018 at 1:38 PM, Brian Goetz wrote: > [ moving to amber-spec-experts] > > I tend to agree. It will take longer to adopt, but it _is_ a new kind of > target in a source file, and then frameworks can decide what it should > mean, and then there's no confusion. > > It's possible, too, as a migration move, to split the difference, though > I'm not sure its worth it -- add a new target, _and_, if the target > includes param/field/method, but does _not_ include record, then lower the > anno onto all applicable members. > > On 4/10/2018 1:34 PM, Remi Forax wrote: > >> No, not right for me, >> a new Annotation target is better so each framework can decide what it >> means for its annotation. >> >> It will slow the adoption but it's better in the long term. >> >> R?mi >> >> ----- Mail original ----- >> >>> De: "Kevin Bourrillion" >>> ?: "Gunnar Morling" >>> Cc: "amber-dev" >>> Envoy?: Mardi 10 Avril 2018 19:25:57 >>> Objet: Re: Records -- Using them as JPA entities and validating them >>> with Bean Validation >>> On Mon, Apr 9, 2018 at 1:39 PM, Gunnar Morling >>> wrote: >>> >>> * Annotation semantics: I couldn't find any example of records with >>>> annotations, but IIUC, something like >>>> >>>> @Entity record Book(@Id long id, String isbn) { ... } >>>> >>>> would desugar into >>>> >>>> class @Entity public class Book { private @Id long id, private >>>> String isbn; ... 
};
>>>>
>>>> For the JPA entity use case it'd be helpful to have an option to lift
>>>> annotations to the corresponding getters instead of the fields (as the
>>>> location of the @Id annotation controls the default strategy -- field vs.
>>>> property -- for reading/writing entity state). Similarly, Bean Validation
>>>> would benefit from such option.
>>>>
>>> My assumption has been that we would allow an annotation on a record
>>> parameter as long as it has *any of* {FIELD,METHOD,PARAMETER} as target,
>>> and that the annotation would be automatically propagated to each
>>> synthesized element it applies to. Does this sound about right to
>>> everyone?
>>>
>>>
>>> --
>>> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com
>>>
>>
>

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Tue Apr 10 20:53:43 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 10 Apr 2018 16:53:43 -0400
Subject: Annos on records
In-Reply-To:
References: <606812520.898989.1523381675460.JavaMail.zimbra@u-pem.fr>
Message-ID:

MR-JARs bust that restriction; in the main part of your jar, you have

    @Target(A, B)
    @interface Foo { }

and in the 13 section you have

    @Target(A, B, RECORD)
    @interface Foo { }

(that's not the whole thing, but it means you only need wait until you _accept_ 13 rather than _require_ 13.)

On 4/10/2018 4:42 PM, Kevin Bourrillion wrote:
> If we create a new ElementType.RECORD, the annotation in question
> won't even be /able/ to add that target type until it is ready to
> /require/ JDK 13 (or whatever) as its new minimum version.
>
>
> On Tue, Apr 10, 2018 at 1:38 PM, Brian Goetz wrote:
>
>     [ moving to amber-spec-experts]
>
>     I tend to agree.
It will take longer to adopt, but it _is_ a new > kind of target in a source file, and then frameworks can decide > what it should mean, and then there's no confusion. > > It's possible, too, as a migration move, to split the difference, > though I'm not sure its worth it -- add a new target, _and_, if > the target includes param/field/method, but does _not_ include > record, then lower the anno onto all applicable members. > > On 4/10/2018 1:34 PM, Remi Forax wrote: > > No, not right for me, > a new Annotation target is better so each framework can decide > what it means for its annotation. > > It will slow the adoption but it's better in the long term. > > R?mi > > ----- Mail original ----- > > De: "Kevin Bourrillion" > > ?: "Gunnar Morling" > > Cc: "amber-dev" > > Envoy?: Mardi 10 Avril 2018 19:25:57 > Objet: Re: Records -- Using them as JPA entities and > validating them with Bean Validation > On Mon, Apr 9, 2018 at 1:39 PM, Gunnar Morling > > wrote: > > ? ?* Annotation semantics: I couldn't find any example > of records with > annotations, but IIUC, something like > > ? ? ? ? ?@Entity record Book(@Id long id, String isbn) > { ... } > > ? ? ?would desugar into > > ? ? ? ? ?class @Entity public class Book { private @Id > long id, private > String isbn; ... }; > > ? ? ?For the JPA entity use case it'd be helpful to > have an option to lift > annotations to the corresponding getters instead of > the fields (as the > location of the @Id annotation controls the default > strategy -- field vs. > property -- for reading/writing entity state). > Similarly, Bean Validation > would benefit from such option. > > My assumption has been that we would allow an annotation > on a record > parameter as long as it has *any of > *{FIELD,METHOD,PARAMETER} as target, > and that the annotation would be automatically propagated > to each > synthesized element it applies to. Does this sound about > right to everyone? > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. 
|
> kevinb at google.com
>
>
>
>
>
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com
>

From forax at univ-mlv.fr Tue Apr 10 21:07:36 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 10 Apr 2018 23:07:36 +0200 (CEST)
Subject: Annos on records (was: Records -- Using them as JPA entities and validating them with Bean Validation)
In-Reply-To:
References: <606812520.898989.1523381675460.JavaMail.zimbra@u-pem.fr>
Message-ID: <350222201.925811.1523394456214.JavaMail.zimbra@u-pem.fr>

Here is what I've done to support ElementType.MODULE in a library that has to work with Java 8. Adding a target type is usually compatible, because the one who adds the annotation target is often the one in control of the code that will also consume the annotation.

In order for this to work you need to answer two questions:
- how to create an annotation compatible with 8 that has a meta-annotation value only available in 9:
  using ASM to add the right value to the annotation's meta-annotation is a 10-line program,
- how to consume a non-existing meta-annotation value:
  I do a switch on the name of the enum instead of doing a switch on the enum itself.

Rémi

> From: "Kevin Bourrillion"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Tuesday, 10 April 2018 22:42:24
> Subject: Re: Annos on records (was: Records -- Using them as JPA entities and
> validating them with Bean Validation)

> If we create a new ElementType.RECORD, the annotation in question won't even be
> able to add that target type until it is ready to require JDK 13 (or whatever)
> as its new minimum version.

> On Tue, Apr 10, 2018 at 1:38 PM, Brian Goetz <brian.goetz at oracle.com> wrote:

>> [ moving to amber-spec-experts]

>> I tend to agree.
It will take longer to adopt, but it _is_ a new kind of target >> in a source file, and then frameworks can decide what it should mean, and then >> there's no confusion. >> It's possible, too, as a migration move, to split the difference, though I'm not >> sure its worth it -- add a new target, _and_, if the target includes >> param/field/method, but does _not_ include record, then lower the anno onto all >> applicable members. >> On 4/10/2018 1:34 PM, Remi Forax wrote: >>> No, not right for me, >>> a new Annotation target is better so each framework can decide what it means for >>> its annotation. >>> It will slow the adoption but it's better in the long term. >>> R?mi >>> ----- Mail original ----- >>>> De: "Kevin Bourrillion" < [ mailto:kevinb at google.com | kevinb at google.com ] > >>>> ?: "Gunnar Morling" < [ mailto:gunnar at hibernate.org | gunnar at hibernate.org ] > >>>> Cc: "amber-dev" < [ mailto:amber-dev at openjdk.java.net | >>>> amber-dev at openjdk.java.net ] > >>>> Envoy?: Mardi 10 Avril 2018 19:25:57 >>>> Objet: Re: Records -- Using them as JPA entities and validating them with Bean >>>> Validation >>>> On Mon, Apr 9, 2018 at 1:39 PM, Gunnar Morling < [ mailto:gunnar at hibernate.org | >>>> gunnar at hibernate.org ] > wrote: >>>>> * Annotation semantics: I couldn't find any example of records with >>>>> annotations, but IIUC, something like >>>>> @Entity record Book(@Id long id, String isbn) { ... } >>>>> would desugar into >>>>> class @Entity public class Book { private @Id long id, private >>>>> String isbn; ... }; >>>>> For the JPA entity use case it'd be helpful to have an option to lift >>>>> annotations to the corresponding getters instead of the fields (as the >>>>> location of the @Id annotation controls the default strategy -- field vs. >>>>> property -- for reading/writing entity state). Similarly, Bean Validation >>>>> would benefit from such option. 
>>>> My assumption has been that we would allow an annotation on a record
>>>> parameter as long as it has *any of* {FIELD,METHOD,PARAMETER} as target,
>>>> and that the annotation would be automatically propagated to each
>>>> synthesized element it applies to. Does this sound about right to everyone?

>>>> --
>>>> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Tue Apr 10 21:18:14 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 10 Apr 2018 17:18:14 -0400
Subject: Switch on java.lang.Class
In-Reply-To:
References: <3510133d-4147-fab9-366f-6ea42b523c4b@oracle.com>
Message-ID: <9cb0d60e-18ae-b06d-915a-c29a80b823cf@oracle.com>

Also, let's separate the problem from the solution here.

Problem: switching on class values.
Solution: make class literals be constant patterns.

I don't think the problem is unworthy of solution, but I don't like the specific solution. But, there may be other ways to get there.

Here are two workarounds that you can do today:

    switch (c.getName()) {
        case "java.lang.String": ...
        case "java.lang.Integer": ...
    }

    enum KnownTypes {
        STRING(String.class), INTEGER(Integer.class), ...;

        static Map classToEnum = new HashMap<>();
        ... constructor populates map ...
    }

    switch (KnownTypes.classToEnum.get(c)) {
        case null: ...
        case STRING: ...
        case INTEGER: ...
    }

These are both workarounds, for sure. (With map literals, we can make either cleaner.)

What other approaches might there be? Well, I don't want to open that discussion now, but clearly, at some point, we'll have the ability to declare explicit patterns.
This opens doors to writing your own patterns that let you switch on arbitrary inputs: ??? switch (c) { ??????? case isStringClass(): ... ??????? case isIntegerClass(): ... ??? } There's a long way to go to get there, and lots of ways to slice this, but I think, if this problem is worth solving, there are other candidate solutions that don't have the confusion downside. On 4/10/2018 1:12 AM, Tagir Valeev wrote: > Hello! > > Does not sound convincing. First, to my experience, it's quite widely > applicable. My first two samples were from what you called a low-level > libraries just because I wanted to grep something well-known. Now I > grepped jdk10 source by `\w+ == \w+\.class` and scanned manually about > 10% of the results and found about 10 places where it's useful (some > examples are shown below). So extrapolating I may assume that this > construct can be applied roughly 100 times in JDK (note that my regexp > does not cover xyz.equals(foo.class) and some developers prefer this > style; also different spacing is not covered). You may surely call the > JDK code as "low-level libraries", but grepping IntelliJ IDEA source > code I also see significant amount of occurrences. Though I don't see > why usefulness of the feature in a low-level libraries should be the > warning sign. > > In any case I'm pretty sure that switch on class will be more > applicable, than the switch on floats. But you are doing theswitch on > floats. Why? For consistency, of course. You want to support all > literals in switch. But class literals are also literals, according to > JLS 15.8.2, so it is inconsistent not to support them (especially > taking into account that their usefulness is not the lowest of all > possible literals). Another comparison: all literals (including class > literals) and enum values are acceptable as annotation values. The > same in switch expressions, but excluding the class literals, which is > inconsistent. > > I don't buy an error-prone argument either. 
Is `switch(doubleValue) > {case Math.PI: ...}` error-prone? Why somebody cannot assume that the > comparison should tolerate some delta difference between doubleValue > and Math.PI? Somebody surely can, but that's silly. We know that the > switch checks for equality, it was always so. It will be so for > classes as well, and assuming something different is inconsistent. > After all, writing foo.equals(Bar.class) or foo == Bar.class is > allowed in the language, people use these constructions, and often > it's the right thing to do. Of course their code becomes erroneous > sometimes, because in this particular place the inheritance should be > taken into account. But the same is true for doubleValue == Math.PI > comparison: sometimes it's ok, sometimes it's wrong and some tolerance > interval should be checked instead. And when it's ok, you add a new > option to use switch on doubles. > > Several code samples found in JDK: > > 1.?javafx.base/javafx/util/converter/LocalDateTimeStringConverter.java:197 > (final classes) > if (type == LocalDate.class) { > ? return (T)LocalDate.from(chronology.date(temporal)); > } else if (type == LocalTime.class) { > ? return (T)LocalTime.from(temporal); > } else { > ? return (T)LocalDateTime.from(chronology.localDateTime(temporal)); > } > > 2.?java.desktop/sun/print/Win32PrintService.java:928 (final classes) > ? ? ? ? if (category == ColorSupported.class) { > ? ? ? ? ? ? int caps = getPrinterCapabilities(); > ? ? ? ? ? ? if ((caps & DEVCAP_COLOR) != 0) { > ? ? ? ? ? ? ? ? return (T)ColorSupported.SUPPORTED; > ? ? ? ? ? ? } else { > ? ? ? ? ? ? ? ? return (T)ColorSupported.NOT_SUPPORTED; > ? ? ? ? ? ? } > ? ? ? ? } else if (category == PrinterName.class) { > ? ? ? ? ? ? return (T)getPrinterName(); > ? ? ? ? } else if (category == PrinterState.class) { > ? ? ? ? ? ? return (T)getPrinterState(); > ? ? ? ? } else if (category == PrinterStateReasons.class) { > ? ? ? ? ? ? return (T)getPrinterStateReasons(); > ? ? ? ? 
} else if (category == QueuedJobCount.class) { > ? ? ? ? ? ? return (T)getQueuedJobCount(); > ? ? ? ? } else if (category == PrinterIsAcceptingJobs.class) { > ? ? ? ? ? ? return (T)getPrinterIsAcceptingJobs(); > ? ? ? ? } else { > ? ? ? ? ? ? return null; > ? ? ? ? } > 3.?com.sun.media.sound.SoftSynthesizer#getPropertyInfo (line 926) - > final classes; several blocks like this > ? ? ? ? ? ? ? ? ? ? ? ? ? ? if (c == Byte.class) > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? item2.value = Byte.valueOf(s); > ? ? ? ? ? ? ? ? ? ? ? ? ? ? else if (c == Short.class) > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? item2.value = Short.valueOf(s); > ? ? ? ? ? ? ? ? ? ? ? ? ? ? else if (c == Integer.class) > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? item2.value = Integer.valueOf(s); > ? ? ? ? ? ? ? ? ? ? ? ? ? ? else if (c == Long.class) > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? item2.value = Long.valueOf(s); > ? ? ? ? ? ? ? ? ? ? ? ? ? ? else if (c == Float.class) > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? item2.value = Float.valueOf(s); > ? ? ? ? ? ? ? ? ? ? ? ? ? ? else if (c == Double.class) > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? item2.value = Double.valueOf(s); > 4.?java.awt.Component#getListeners (interfaces!), Window#getListeners, > List#getListeners, JComponent#getListeners, etc. are similar > ? ? public T[] getListeners(Class > listenerType) { > ? ? ? ? EventListener l = null; > ? ? ? ? if? (listenerType == ComponentListener.class) { > ? ? ? ? ? ? l = componentListener; > ? ? ? ? } else if (listenerType == FocusListener.class) { > ? ? ? ? ? ? l = focusListener; > ? ? ? ? } else if (listenerType == HierarchyListener.class) { > ? ? ? ? ? ? l = hierarchyListener; > ? ? ? ? } else if (listenerType == HierarchyBoundsListener.class) { > ? ? ? ? ? ? l = hierarchyBoundsListener; > ? ? ? ? } else if (listenerType == KeyListener.class) { > ? ? ? ? ? ? l = keyListener; > ? ? ? ? } else if (listenerType == MouseListener.class) { > ? ? ? ? ? ? l = mouseListener; > ? ? ? ? } else if (listenerType == MouseMotionListener.class) { > ? ? ? ? 
? ? l = mouseMotionListener; > ? ? ? ? } else if (listenerType == MouseWheelListener.class) { > ? ? ? ? ? ? l = mouseWheelListener; > ? ? ? ? } else if (listenerType == InputMethodListener.class) { > ? ? ? ? ? ? l = inputMethodListener; > ? ? ? ? } else if (listenerType == PropertyChangeListener.class) { > ? ? ? ? ? ? return (T[])getPropertyChangeListeners(); > ? ? ? ? } > ? ? ? ? return AWTEventMulticaster.getListeners(l, listenerType); > ? ? } > 5.?java.beans.XMLEncoder#primitiveTypeFor (final classes) > ? ? ? ?if (wrapper == Boolean.class) return Boolean.TYPE; > ? ? ? ? if (wrapper == Byte.class) return Byte.TYPE; > ? ? ? ? if (wrapper == Character.class) return Character.TYPE; > ? ? ? ? if (wrapper == Short.class) return Short.TYPE; > ? ? ? ? if (wrapper == Integer.class) return Integer.TYPE; > ? ? ? ? if (wrapper == Long.class) return Long.TYPE; > ? ? ? ? if (wrapper == Float.class) return Float.TYPE; > ? ? ? ? if (wrapper == Double.class) return Double.TYPE; > ? ? ? ? if (wrapper == Void.class) return Void.TYPE; > ? ? ? ? return null; > ?6.?javax.swing.plaf.synth.SynthTableUI.SynthTableCellRenderer#configureValue > (mix of abstract, non-final and final classes) > ? ? ? ? private void configureValue(Object value, Class columnClass) { > ? ? ? ? ? ? if (columnClass == Object.class || columnClass == null) { > // case Object.class, null! > ? ? ? ? ? ? ? ? setHorizontalAlignment(JLabel.LEADING); > ? ? ? ? ? ? } else if (columnClass == Float.class || columnClass == > Double.class) { > ? ? ? ? ? ? ? ? if (numberFormat == null) { > ? ? ? ? ? ? ? ? ? ? numberFormat = NumberFormat.getInstance(); > ? ? ? ? ? ? ? ? } > ? ? ? ? ? ? ? ? setHorizontalAlignment(JLabel.TRAILING); > ? ? ? ? ? ? ? ? setText((value == null) ? "" : > ((NumberFormat)numberFormat).format(value)); > ? ? ? ? ? ? } > ? ? ? ? ? ? else if (columnClass == Number.class) { > ? ? ? ? ? ? ? ? setHorizontalAlignment(JLabel.TRAILING); > ? ? ? ? ? ? ? ? // Super will have set value. > ? ? ? ? ? ? } > ? ? ? ? ? ? 
else if (columnClass == Icon.class || columnClass == > ImageIcon.class) { > ? ? ? ? ? ? ? ? setHorizontalAlignment(JLabel.CENTER); > ? ? ? ? ? ? ? ? setIcon((value instanceof Icon) ? (Icon)value : null); > ? ? ? ? ? ? ? ? setText(""); > ? ? ? ? ? ? } > ? ? ? ? ? ? else if (columnClass == Date.class) { > ? ? ? ? ? ? ? ? if (dateFormat == null) { > ? ? ? ? ? ? ? ? ? ? dateFormat = DateFormat.getDateInstance(); > ? ? ? ? ? ? ? ? } > ? ? ? ? ? ? ? ? setHorizontalAlignment(JLabel.LEADING); > ? ? ? ? ? ? ? ? setText((value == null) ? "" : > ((Format)dateFormat).format(value)); > ? ? ? ? ? ? } > ? ? ? ? ? ? else { > ? ? ? ? ? ? ? ? configureValue(value, columnClass.getSuperclass()); // > note this: recursively going to superclass automatically > ? ? ? ? ? ? } > ? ? ? ? } > > With best regards, > Tagir Valeev. > > > On Mon, Apr 9, 2018 at 8:38 PM, Brian Goetz > wrote: > > I'm skeptical of this feature, because (a) its not as widely > applicable as it looks, (b) its error-prone. > > Both of these stem from the fact that comparing classes with == > excludes subtypes.? So it really only works with final classes -- > but if we had a feature like this, people might mistakenly use it > with nonfinal classes, and be surprised when a subtype shows up > (this can happen even when your IDE tells you there are no > subtypes, because of dynamic proxies).? And all of the examples > you show are in low-level libraries, which is a warning sign. > > Where did these snippets get their Class from?? Good chance, case > 1 got it from calling Object.getClass().? In which case, they can > just pattern match on the type of the thing: > > ??? switch (date) { > ??????? case Date d: ... > ??????? case Timestamp t: ... > ??????? default: ... > ??? } > > Case 2 is more likely just operating on types that it got from a > reflection API.? If you have only a few entries, an if-else will > do; if you have more entries, a Map is likely to be the better > choice.? 
For situations like this, I'd rather invest in map > literals or better Map.of() builders. > > So, I would worry this feature is unlikely to carry its weight, > and further, may lead to misuse. > > > > On 4/9/2018 1:07 AM, Tagir Valeev wrote: > > Hello! > > I don't remember whether switch on java.lang.Class instance > was discussed. I guess, this pattern is quite common and it > will be useful to support it. Such code often appears in > deserialization logic when we branch on desired type to > deserialize. Here are a couple of examples from opensource > libraries: > > 1. com.google.gson.DefaultDateTypeAdapter#read (gson-2.8.2): > > ? ? Date date = deserializeToDate(in.nextString()); > ? ? if (dateType == Date.class) { > ? ? ? return date; > ? ? } else if (dateType == Timestamp.class) { > ? ? ? return new Timestamp(date.getTime()); > ? ? } else if (dateType == java.sql.Date.class) { > ? ? ? return new java.sql.Date(date.getTime()); > ? ? } else { > ? ? ? // This must never happen: dateType is guarded in the > primary constructor > ? ? ? throw new AssertionError(); > ? ? } > > Could be rewritten as: > > ? ? Date date = deserializeToDate(in.nextString()); > ? ? return switch(dateType) { > ? ? ? case Date.class -> date; > ? ? ? case Timestamp.class -> new Timestamp(date.getTime()); > ? ? ? case java.sql.Date.class -> new > java.sql.Date(date.getTime()); > ? ? ? default -> > ? ? ? ? // This must never happen: dateType is guarded in the > primary constructor > ? ? ? ? throw new AssertionError(); > ? ? }; > > 2. > com.fasterxml.jackson.databind.deser.std.FromStringDeserializer#findDeserializer > (jackson-databind-2.9.4): > > ? ? public static Std findDeserializer(Class rawType) > ? ? { > ? ? ? ? int kind = 0; > ? ? ? ? if (rawType == File.class) { > ? ? ? ? ? ? kind = Std.STD_FILE; > ? ? ? ? } else if (rawType == URL.class) { > ? ? ? ? ? ? kind = Std.STD_URL; > ? ? ? ? } else if (rawType == URI.class) { > ? ? ? ? ? ? kind = Std.STD_URI; > ? ? ? ? 
} else if (rawType == Class.class) {
>             kind = Std.STD_CLASS;
>         } else if (rawType == JavaType.class) {
>             kind = Std.STD_JAVA_TYPE;
>         } else if // more branches like this
>         } else {
>             return null;
>         }
>         return new Std(rawType, kind);
>     }
>
> Could be rewritten as:
>
>     public static Std findDeserializer(Class rawType)
>     {
>         int kind = switch(rawType) {
>         case File.class -> Std.STD_FILE;
>         case URL.class -> Std.STD_URL;
>         case URI.class -> Std.STD_URI;
>         case Class.class -> Std.STD_CLASS;
>         case JavaType.class -> Std.STD_JAVA_TYPE;
>         ...
>         default -> 0;
>         };
>         return kind == 0 ? null : new Std(rawType, kind);
>     }
>
> In such code all branches are mutually exclusive. The bootstrap method can generate a lookupswitch based on Class.hashCode, then equals checks, pretty similar to the String switch implementation. Unlike String hash codes, Class.hashCode is not stable and varies between JVM launches, but the hash codes are already known during the bootstrap and we can trust them during the VM lifetime, so we can generate a lookupswitch. The minor problematic point is supporting primitive classes like int.class. These cannot be passed directly as indy static arguments, but this can be solved with condy.
>
> What do you think?
>
> With best regards,
> Tagir Valeev.
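The enum-plus-map workaround suggested earlier in this message can be fleshed out roughly as follows. This is only a sketch under stated assumptions: the names KnownType, BY_CLASS, of, and ClassSwitchDemo are made up for illustration, and the map is filled from a static initializer rather than from the constructor, because Java rejects references to a non-constant static field of an enum from that enum's constructor. As with == on classes, the lookup is an exact match, so subtypes of a listed class are not recognized.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; only the overall shape comes from the thread.
enum KnownType {
    STRING(String.class),
    INTEGER(Integer.class);

    // Reverse index from Class to constant, filled once all constants exist.
    private static final Map<Class<?>, KnownType> BY_CLASS = new HashMap<>();
    static {
        for (KnownType t : values()) {
            BY_CLASS.put(t.type, t);
        }
    }

    private final Class<?> type;

    KnownType(Class<?> type) {
        this.type = type;
    }

    // Returns null for classes outside the known set (exact match only).
    static KnownType of(Class<?> c) {
        return BY_CLASS.get(c);
    }
}

final class ClassSwitchDemo {
    static String describe(Class<?> c) {
        KnownType known = KnownType.of(c);
        if (known == null) {
            // A classic switch throws NullPointerException on null, so check first.
            return "unknown";
        }
        switch (known) {
            case STRING:
                return "string";
            case INTEGER:
                return "integer";
            default:
                return "unknown";
        }
    }
}
```

The explicit null check stands in for the "case null" label shown in the message, which a classic switch statement does not accept.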
From brian.goetz at oracle.com Tue Apr 10 21:19:16 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 10 Apr 2018 17:19:16 -0400
Subject: Annos on records
In-Reply-To: <350222201.925811.1523394456214.JavaMail.zimbra@u-pem.fr>
References: <606812520.898989.1523381675460.JavaMail.zimbra@u-pem.fr> <350222201.925811.1523394456214.JavaMail.zimbra@u-pem.fr>
Message-ID: <6938fa4e-6f08-865b-88f4-fbf31ff70e2d@oracle.com>

And MR-Jar obviates using ASM to add the right value to the meta-annotation; you just have two sources, one for 13 and one for prior.

On 4/10/2018 5:07 PM, Remi Forax wrote:
> Here is what I've done to support ElementType.MODULE in a library that has to work with Java 8.
> Adding a target type is usually compatible, because the one who adds the annotation target is often the one in control of the code that will also consume the annotation.
>
> To make it work you need to answer two questions:
>  - how to create an annotation compatible with Java 8 that has a meta-annotation value only available in 9:
>    using ASM to add the right value to the annotation's meta-annotation is a 10-line program;
>  - how to consume a non-existent meta-annotation value:
>    I do a switch on the name of the enum constant instead of doing a switch on the enum itself.
>
> Rémi
>
> ------------------------------------------------------------------------
>
> *From:* "Kevin Bourrillion"
> *To:* "Brian Goetz"
> *Cc:* "amber-spec-experts"
> *Sent:* Tuesday 10 April 2018 22:42:24
> *Subject:* Re: Annos on records (was: Records -- Using them as JPA entities and validating them with Bean Validation)
>
> If we create a new ElementType.RECORD, the annotation in question won't even be /able/ to add that target type until it is ready to /require/ JDK 13 (or whatever) as its new minimum version.
>
>
> On Tue, Apr 10, 2018 at 1:38 PM, Brian Goetz wrote:
>
> [ moving to amber-spec-experts]
>
> I tend to agree.
It will take longer to adopt, but it _is_ a > new kind of target in a source file, and then frameworks can > decide what it should mean, and then there's no confusion. > > It's possible, too, as a migration move, to split the > difference, though I'm not sure its worth it -- add a new > target, _and_, if the target includes param/field/method, but > does _not_ include record, then lower the anno onto all > applicable members. > > On 4/10/2018 1:34 PM, Remi Forax wrote: > > No, not right for me, > a new Annotation target is better so each framework can > decide what it means for its annotation. > > It will slow the adoption but it's better in the long term. > > R?mi > > ----- Mail original ----- > > De: "Kevin Bourrillion" > > ?: "Gunnar Morling" > > Cc: "amber-dev" > > Envoy?: Mardi 10 Avril 2018 19:25:57 > Objet: Re: Records -- Using them as JPA entities and > validating them with Bean Validation > On Mon, Apr 9, 2018 at 1:39 PM, Gunnar Morling > > > wrote: > > ? ?* Annotation semantics: I couldn't find any > example of records with > annotations, but IIUC, something like > > ? ? ? ? ?@Entity record Book(@Id long id, String > isbn) { ... } > > ? ? ?would desugar into > > ? ? ? ? ?class @Entity public class Book { private > @Id long id, private > String isbn; ... }; > > ? ? ?For the JPA entity use case it'd be helpful > to have an option to lift > annotations to the corresponding getters instead > of the fields (as the > location of the @Id annotation controls the > default strategy -- field vs. > property -- for reading/writing entity state). > Similarly, Bean Validation > would benefit from such option. > > My assumption has been that we would allow an > annotation on a record > parameter as long as it has *any of > *{FIELD,METHOD,PARAMETER} as target, > and that the annotation would be automatically > propagated to each > synthesized element it applies to. Does this sound > about right to everyone? > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. 
| kevinb at google.com
>
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From daniel.smith at oracle.com Tue Apr 10 22:20:01 2018
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 10 Apr 2018 16:20:01 -0600
Subject: Switch expressions -- gathering the threads
In-Reply-To: <1288016983.922021.1523392253596.JavaMail.zimbra@u-pem.fr>
References: <403596bb-406b-6b99-1dd5-420f7bea5dfa@oracle.com> <55F4951B-277D-4AD9-A96E-DE36406C6ACB@oracle.com> <1288016983.922021.1523392253596.JavaMail.zimbra@u-pem.fr>
Message-ID: <324360E8-22AC-4947-8D3F-6D364436CA0A@oracle.com>

> On Apr 10, 2018, at 2:30 PM, Remi Forax wrote:
>
> I'm not sure this difference is important.
>
> What about the example below, multiple labels or a fallthrough?
> switch(x) {
>   case 0:
>     ;
>   case 1:
> }

My request is to call this an example of fallthrough.

I think you're trying to make a point that some forms of switches with fallthrough behave the same as switches with multiple labels. Sure, that's fine. I still think it's helpful to talk about the two cases separately, as distinct features, because the practical use cases are very different.
-Dan

From forax at univ-mlv.fr Tue Apr 10 22:47:45 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 11 Apr 2018 00:47:45 +0200 (CEST)
Subject: Switch expressions -- gathering the threads
In-Reply-To: <324360E8-22AC-4947-8D3F-6D364436CA0A@oracle.com>
References: <403596bb-406b-6b99-1dd5-420f7bea5dfa@oracle.com> <55F4951B-277D-4AD9-A96E-DE36406C6ACB@oracle.com> <1288016983.922021.1523392253596.JavaMail.zimbra@u-pem.fr> <324360E8-22AC-4947-8D3F-6D364436CA0A@oracle.com>
Message-ID: <834610449.936839.1523400465859.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "daniel smith"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Wednesday 11 April 2018 00:20:01
> Subject: Re: Switch expressions -- gathering the threads

>> On Apr 10, 2018, at 2:30 PM, Remi Forax wrote:
>>
>> I'm not sure this difference is important.
>>
>> What about the example below, multiple labels or a fallthrough?
>> switch(x) {
>>   case 0:
>>     ;
>>   case 1:
>> }
>
> My request is to call this an example of fallthrough.
>
> I think you're trying to make a point that some forms of switches with fallthrough behave the same as switches with multiple labels. Sure, that's fine. I still think it's helpful to talk about the two cases separately, as distinct features, because the practical use cases are very different.

I think I see all forms as being fallthrough, and what you call the multiple-labels form as the result of a peephole optimization, i.e., if there is no instruction between the two cases, then the compiler will make them share the same label.
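For readers following the terminology, the two forms under discussion can be written side by side. This is just an illustrative sketch (the class and method names are invented here, and the return values are added so the behavior is observable); it uses only classic switch statements and shows that an empty statement between two labels ends up running the same code that a shared statement group would run.

```java
// Hypothetical demo class; not taken from the thread.
final class FallthroughDemo {

    // Fallthrough: case 0 executes the empty statement ";" and then
    // continues into the statements under case 1.
    static String viaFallthrough(int x) {
        switch (x) {
            case 0:
                ;
            case 1:
                return "zero or one";
            default:
                return "other";
        }
    }

    // Multiple labels: case 0 and case 1 label the same statement group.
    static String viaMultipleLabels(int x) {
        switch (x) {
            case 0:
            case 1:
                return "zero or one";
            default:
                return "other";
        }
    }
}
```

Both methods return the same result for every input, which is the observation behind treating the second form as a special case of the first; the disagreement in the thread is about whether it is still useful to name them separately.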
>
> -Dan

Rémi

From gavin.bierman at oracle.com Thu Apr 12 21:27:06 2018
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Thu, 12 Apr 2018 22:27:06 +0100
Subject: JEP325: Switch expressions spec
Message-ID: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com>

I have uploaded a draft spec for JEP 325: Switch expressions at http://cr.openjdk.java.net/~gbierman/switch-expressions.html

Note there are still three things missing:

* There is no text about typing a switch expression, as this is still being discussed on this list.
* There is no name given for the exception raised at runtime when a switch expression fails to find a matching pattern label, as this is still being discussed on this list.
* The spec currently permits fall through from a "case pattern:" statement group into a "case pattern ->" clause. We are still working through the consequences of removing this possibility.

Comments welcomed!
Gavin

From maurizio.cimadamore at oracle.com Fri Apr 13 11:15:01 2018
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 13 Apr 2018 12:15:01 +0100
Subject: JEP325: Switch expressions spec
In-Reply-To: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com>
Message-ID: <973dfd0e-034c-246b-ecf9-3c50a7f6d501@oracle.com>

Looks neat. Some comments:

* I note that you introduced patterns to describe the new syntactic options; while that's a completely fine choice, I wonder if it could lead to confusion - I always thought of JEP 325 as a set of standalone switch improvements, which don't need the P-word to be justified. Of course, I'm not opposed to what you have done, just noting (aloud) the mismatch with my expectations.

* in 14.11 I find these sentences:

"then we say that the null pattern matches", "then we say that the pattern matches"

A bit odd to read, as the transitive verb 'matches' is missing its object.
* also I note some replication:

"If all these statements complete normally, or if there are no statements after the pattern label containing the matching pattern, then the entire switch statement completes normally."

"If all these statements complete normally, or if there are no statements after the pattern label containing the matching pattern, then the entire switch statement completes normally."

"If all these statements complete normally, or if there are no statements after the default pattern label, then the entire switch statement completes normally."

The first two are identical, the last only slightly different; perhaps something can be done to consolidate.

* "A break statement either transfers control out of an enclosing statement or returns a value to an immediately enclosing switch expression."

Is it an either/or? My mental model is that break always transfers control out - it can do so with a value, or w/o a value (as in a classic break).

* I like the fact that you define the semantics of the expression switch clauses in terms of desugaring to statement blocks - this is consistent with what we do in other areas (enhanced for loop, try with resources).

* I suggest putting the paragraph in 15.29 starting with:

"Given a switch expression, all of the following must be true"

ahead of the desugaring paragraph, which seems more execution/semantics-related, while this one is still about well-formedness.

* On totality - this line:

    default -> 10; // Legal

deserves some more explanation - e.g. one might think it's unreachable, but it's not, because new constants could pop up at runtime; maybe add a clarification.

* On non-returning, this sentence is obscure:

"Thus a switch expression block that can not complete normally, can only do so by occurrences of a break statement with an Expression. This ensures that a switch expression must either result in a value, or complete abruptly."
because it contradicts what is said just a line above:

"an occurrence of a break statement with an Expression in a switch expression means that the switch expression will complete normally with the value of the Expression"

The way I read this is:

1) the only way for the block after a 'case' in a switch pattern to complete abnormally is via a break expression

2) even if the _block_ completes abnormally, the containing switch expression will complete normally, with the value of Expression

Is that what you meant?

* At the end of the switch expression section there are sub-optimal sentences like the ones that appear for switch statements (e.g. "pattern matches") - see above.

Cheers
Maurizio

On 12/04/18 22:27, Gavin Bierman wrote:
> I have uploaded a draft spec for JEP 325: Switch expressions at http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>
> Note there are still three things missing:
>
> * There is no text about typing a switch expression, as this is still being discussed on this list.
> * There is no name given for the exception raised at runtime when a switch expression fails to find a matching pattern label, as this is still being discussed on this list.
> * The spec currently permits fall through from a "case pattern:" statement group into a "case pattern ->" clause. We are still working through the consequences of removing this possibility.
>
> Comments welcomed!
> Gavin

From brian.goetz at oracle.com Fri Apr 13 16:46:39 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 13 Apr 2018 12:46:39 -0400
Subject: [records] Ancillary fields (was: Records -- current status)
In-Reply-To:
References:
Message-ID: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com>

Let's see if we can make some progress on the elephant in the room -- ancillary fields. Several have expressed the concern that without the ability to declare some additional instance state, the feature will be too limited.
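A concrete instance of "additional instance state" that is purely derived may help frame the discussion. The sketch below is an ordinary class rather than a record, since record syntax was still under design at the time; CachedPoint and its members are hypothetical names chosen for illustration. The extra field only caches a value computed from x and y, in the way String caches its hash code, so it adds nothing to what equals compares.

```java
// Hypothetical example class; not part of any proposal in this thread.
final class CachedPoint {
    private final int x;
    private final int y;

    // Ancillary field: a lazily filled cache derived from x and y.
    // 0 means "not computed yet"; a genuine hash of 0 is simply recomputed.
    private int hash;

    CachedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof CachedPoint)) {
            return false;
        }
        CachedPoint p = (CachedPoint) other;
        // The cache is deliberately ignored: only the real state is compared.
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            h = 31 * x + y;
            hash = h;
        }
        return h;
    }
}
```

Because the cached value can always be rebuilt from x and y, taking such an object apart into its components and putting it back together loses no information.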
The argument in favor of additional fields is the obvious one; more classes can be records. And there are some arguably valid use cases for additional fields that don't conflict with the design center for records. The best example is derived state:

 - When a field is a cached property derived from the record state (such as how String caches its hashCode)

Arguably, if a field is derived deterministically from immutable record state, then it is not creating any new record state. This surely seems within the circle.

The argument against is more of a slippery-slope one; I believe developers would like to view this feature through the lens of syntactic boilerplate, rather than through semantics. If we let them, they would surely and routinely do the following:

    record A(int a, int b) {
        private int c;

        public A(int a, int b, int c) {
            this(a, b);
            this.c = c;
        }

        public boolean equals(Object other) {
            return default.equals(other) && ((A) other).c == c;
        }
    }

Here, `c` is surely part of the state of `A`. And, they wouldn't even know what they'd lost; they would just assume records are a way of "kickstarting" a class declaration with some public fields, and then you can mix in whatever private state you want.

Why is this bad? While "reduced-boilerplate classes" is a valid feature idea, our design goal for records is much more than that. The semantic constraints on records are valuable because they yield useful invariants; that they are "just" their state vector, that they can be freely taken apart and put back together with no loss of information, and hence can be freely serialized/marshaled to JSON and back, etc.

We currently prohibit records like `A` via a number of restrictions: no additional fields, no override of equals. We don't need all of these restrictions to achieve the desired goal, but we also can't relax them all without opening the gate.
So we should decide carefully which we want to relax, as making the wrong choice constrains us in the future.

Before I dive into details of how we might extend records to support the case of "cached derived state", I'd like to first come to some agreement that this covers the use cases that we think fall into the "legitimate" uses of additional fields.

On 3/16/2018 2:55 PM, Brian Goetz wrote:
> There are a number of potentially open details on the design for records. My inclination is to start with the simplest thing that preserves the flexibility and expectations we want, and consider opening up later as necessary.
>
> One of the biggest issues, which Kevin raised as a must-address issue, is having sufficient support for precondition validation. Without foreclosing on the ability to do more later with declarative guards, I think the recent construction proposal meets the requirement for lightweight enforcement with minimal or no duplication. I'm hopeful that this bit is "there".
>
> Our goal all along has been to define records as being "just macros" for a finer-grained set of features. Some of these are motivated by boilerplate; some are motivated by semantics (coupling semantics of API elements to state.) In general, records will get there first, and then ordinary classes will get the more general feature, but the default answer for "can you relax records, so I can use it in this case that almost but doesn't quite fit" should be "no, but there will probably be a feature coming that makes that class simpler, wait for that."
>
>
> Some other open issues (please see my writeup at http://cr.openjdk.java.net/~briangoetz/amber/datum.html for reference), and my current thoughts on these, are outlined below. Comments welcome!
>
>  - Extension. The proposal outlines a notion of abstract record, which provides a "width subtyped" hierarchy.
> Some have questioned whether this carries its weight, especially given how Scala doesn't support case-to-case extension (some see this as a bug, others as an existence proof.) Records can implement interfaces.
>
>  - Concrete records are final. Relaxing this adds complexity to the equality story; I'm not seeing good reasons to do so.
>
>  - Additional constructors. I don't see any reason why additional constructors are problematic, especially if they are constrained to delegate to the default constructor (which in turn is made far simpler if there can be statements ahead of the this() call.) Users may find the lack of additional constructors to be an arbitrary limitation (and they'd probably be right.)
>
>  - Static fields. Static fields seem harmless.
>
>  - Additional instance fields. These are a much bigger concern. While the primary arguments against them are of the "slippery slope" variety, I still have deep misgivings about supporting unrestricted non-principal instance fields, and I also haven't found a reasonable set of restrictions that makes this less risky. I'd like to keep looking for a better story here, before just caving on this, as I worry doing so will end up biting us in the back.
>
>  - Mutability and accessibility. I'd like to propose an odd choice here, which is: fields are final and package (protected for abstract records) by default, but finality can be explicitly opted out of (non-final) and accessibility can be explicitly widened (public).
>
>  - Accessors. Perhaps the most controversial aspect is that records are inherently transparent to read; if something wants to truly encapsulate state, it's not a record. Records will eventually have pattern deconstructors, which will expose their state, so we should go out of the gate with the equivalent. The obvious choice is to expose read accessors automatically.
> (These will not be named getXxx; we are not burning the ill-advised Javabean naming conventions into the language, no matter how much people think it already is.) The obvious naming choice for these accessors is fieldName(). No provision for write accessors; that's bring-your-own.
>
>  - Core methods. Records will get equals, hashCode, and toString. There's a good argument for making equals/hashCode final (so they can't be explicitly redeclared); this gives us stronger preservation of the data invariants that allow us to safely and mechanically snapshot / serialize / marshal (we'd definitely want this if we ever allowed additional instance fields.) No reason to suppress override of toString, though. Records could be safely made cloneable() with automatic support too (like arrays), but it's not clear if this is worth it (it's darn useful for arrays, though.) I think the auto-generated getters should be final too; this leaves arrays as second-class components, but I am not sure that bothers me.

From kevinb at google.com Fri Apr 13 17:15:47 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Fri, 13 Apr 2018 10:15:47 -0700
Subject: [records] Ancillary fields (was: Records -- current status)
In-Reply-To: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com>
References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com>
Message-ID:

As one of the voices demanding we allow ancillary fields, I can confirm that I had only these derived-state use cases in mind. I don't see anything else as legitimate. That is, I think that the semantic invariants you're trying to preserve for records are worth fighting for, and additional *non-derived* state would violate them.

On Fri, Apr 13, 2018 at 9:46 AM, Brian Goetz wrote:
> Let's see if we can make some progress on the elephant in the room -- ancillary fields.
Several have expressed the concern that without the > ability to declare some additional instance state, the feature will be too > limited. > > The argument in favor of additional fields is the obvious one; more > classes can be records. And there are some arguably valid use cases for > additional fields that don't conflict with the design center for records. > The best example is derived state: > > - When a field is a cached property derived from the record state (such > as how String caches its hashCode) > > Arguably, if a field is derived deterministically from immutable record > state, then it is not creating any new record state. This surely seems > within the circle. > > The argument against is more of a slippery-slope one; I believe developers > would like to view this feature through the lens of syntactic boilerplate, > rather than through semantics. If we let them, they would surely and > routinely do the following: > > record A(int a, int b) { > private int c; > > public A(int a, int b, int c) { > this(a, b); > this.c = c; > } > > public boolean equals(Object other) { > return default.equals(other) && ((A) other).c == c; > } > } > > Here, `c` is surely part of the state of `A`. And, they wouldn't even > know what they'd lost; they would just assume records are a way of > "kickstarting" a class declaration with some public fields, and then you > can mix in whatever private state you want. > > Why is this bad? While "reduced-boilerplate classes" is a valid feature > idea, our design goal for records is much more than that. The semantic > constraints on records are valuable because they yield useful invariants; > that they are "just" their state vector, that they can be freely taken > apart and put back together with no loss of information, and hence can be > freely serialized/marshaled to JSON and back, etc. > > We currently prohibit records like `A` via a number of restrictions: no > additional fields, no override of equals. 
> We don't need all of these restrictions to achieve the desired goal, but we also can't relax them all without opening the gate. So we should decide carefully which we want to relax, as making the wrong choice constrains us in the future.
>
> Before I dive into details of how we might extend records to support the case of "cached derived state", I'd like to first come to some agreement that this covers the use cases that we think fall into the "legitimate" uses of additional fields.

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Fri Apr 13 17:17:10 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 13 Apr 2018 13:17:10 -0400
Subject: [records] equals / hashCode (was: Records -- current status)
In-Reply-To:
References:
Message-ID: <64e8fca3-ab81-e65e-85d5-26bd9f69eabd@oracle.com>

Along the lines of the previous mail, people have and will ask "why can't I redefine equals/hashCode". And the answer has two layers:

 - The constraints on equals/hashCode are stronger for records, and users might inadvertently violate them. (They can be specified in the overrides of equals/hashCode in AbstractRecord, so there at least can be a place where this specification lives, even if no one reads it.)
 - In conjunction with ancillary fields, the constraints are sure to be violated, whether inadvertently or deliberately.

Let's take a look at what sorts of modifications to equals/hashCode would be OK, should we decide to relax this restriction. Equality should still derive from the record's state, but there might be acceptable variations.

Would it be OK to _widen_ the definition of equality, by ignoring a component of the record? This is an example of what Gunnar asked for, which is to restrict equality to the primary key fields:

    record PersonEntity(int primaryKey, String name, int age) {
        // equality based only on primaryKey
    }

Is this OK? Well, let's look at our model:

 - Does ctor(dtor(c)) == c? Yes.
 - If S1==S2, does ctor(S1) == ctor(S2)? Yes.
 - For equal instances, does mutating them in the same way yield equal instances? Yes.
 - For equal instances, does calling the same method on both with the same parameters yield equivalent results? No.

So, if p1 == p2, we cannot rely on p1.age() == p2.age(), so this fails the requirements of our pseudo-formal model. (Assuming our model is the right one.)

So, how would we feel about that? Two records that are equals() to each other, but not substitutable?

A more subtle version of this would be to consider all components, but use a more inclusive notion of equality for that field, such as comparing array components by contents.

    record Numbers(int[] numbers) {
        // equality based on Arrays.equals()
    }

 - Does ctor(dtor(c)) == c? Yes.
 - Do equal state vectors produce equal records? Yes.
 - Do identical mutations on equal records produce equal records? Yes.
 - Do identical operations on equal records produce equal results? Almost...

The Almost qualification can be seen here:

    int[] a1;
    int[] a2 = copyOf(a1);
    Numbers r1 = new Numbers(a1), r2 = new Numbers(a2);
    boolean same = r1.numbers().equals(r2.numbers());

The accessor will yield up the array references, which will not be equals() to each other. This is essentially the same problem as above.

You get a similar result if your record represents something like a rational number and you don't normalize to lowest terms in the constructor; then you can have q1 equal q2, but q1.numerator() != q2.numerator().

Are any of these variations compelling enough to suggest we've got the wrong model?

From forax at univ-mlv.fr Sat Apr 14 22:14:13 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 15 Apr 2018 00:14:13 +0200 (CEST)
Subject: Record design (and ancillary fields)
In-Reply-To:
References:
Message-ID: <469125311.2612492.1523744053717.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Daniel Latrémolière"
> To: "amber-spec-comments"
> Sent: Saturday, April 14, 2018 05:43:40
> Subject: Record design (and ancillary fields)

> Isn't it possible to do for a record, like database design:

interesting question

> - fields are, by default, read-write and not concerned by identity of the row/instance.
>
> - one special field (primary key) has all constraints of the identity of the row/instance.
>
> For a record, that would signify that one field has to be marked __Identity. It will be the only field used in equals/hashCode methods of the record.
>
> For satisfying constraints of identity (immutability), this field would be final and necessarily of a primitive type or value type (composite primary key). Given a value type can be scalarized in the class, restricting identity to only one field would not have real cost in instance.

I do not think we have to do something specific for supporting relational database mapping; the tools that do this mapping already rely on an annotation processor or bytecode agent to change the user code (at least to track the changes), so those tools can be updated to detect that a class is a record and provide the right equals/hashCode if those methods are not user defined.

> Just my point of view,
>
> Daniel.
>
> PS: Given primitive/value type disallow cyclical references, this will prohibit StackOverflowException in equals/hashCode methods.

only if an equals on a value type that contains an object doesn't call equals on that object.

Rémi

From brian.goetz at oracle.com Sat Apr 14 23:11:21 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 14 Apr 2018 19:11:21 -0400
Subject: Reader mail bag
References:
Message-ID: <6E0B3626-8CBB-47C5-80D5-C94040CACD40@oracle.com>

This was received on the amber-spec-comments list.

> Begin forwarded message:
>
> From: Daniel Latrémolière
> Subject: Record design (and ancillary fields)
> Date: April 13, 2018 at 11:43:40 PM EDT
> To: amber-spec-comments at openjdk.java.net
>
> Isn't it possible to do for a record, like database design:
>
> - fields are, by default, read-write and not concerned by identity of the row/instance.
>
> - one special field (primary key) has all constraints of the identity of the row/instance.
>
> For a record, that would signify that one field has to be marked __Identity. It will be the only field used in equals/hashCode methods of the record.
>
> For satisfying constraints of identity (immutability), this field would be final and necessarily of a primitive type or value type (composite primary key). Given a value type can be scalarized in the class, restricting identity to only one field would not have real cost in instance.
>
> Just my point of view,
>
> Daniel.
>
> PS: Given primitive/value type disallow cyclical references, this will prohibit StackOverflowException in equals/hashCode methods.

From gavin.bierman at oracle.com Mon Apr 16 16:53:17 2018
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Mon, 16 Apr 2018 17:53:17 +0100
Subject: JEP325: Switch expressions spec
In-Reply-To: <973dfd0e-034c-246b-ecf9-3c50a7f6d501@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <973dfd0e-034c-246b-ecf9-3c50a7f6d501@oracle.com>
Message-ID:

Thanks Maurizio. Some replies inline.

> On 13 Apr 2018, at 12:15, Maurizio Cimadamore wrote:
>
> Looks neat. Some comments:
>
> * I note that you introduced patterns to describe the new syntactic options; while that's a completely fine choice, I wonder if it could lead to confusion - I always thought of JEP 325 as a set of standalone switch improvements, which don't need the P-word to be justified. Of course, I'm not opposed to what you have done, just noting (aloud) the mismatch with my expectations.

Yes, you spotted me setting things up for a future release :-) But in my defence: in the current spec, we say "case constant" where constant is either a constant expression or an enum constant. We are adding to this the possibility of a "null", so we need to find another word anyhow. That said, I think you have a point, so I'll look again to see if I can dial it down a bit.
>
> * in 14.11 I find these sentences:
>
> "then we say that the null pattern matches", "then we say that the pattern matches"
>
> A bit odd to read, as the transitive verb 'matches' is missing its object.

I know what you mean, but the spec today already states "...then we say that the case label *matches*." So I actually kept that text as it is.

> * also I note some replication:
>
> "If all these statements complete normally, or if there are no statements after the pattern label containing the matching pattern, then the entire switch statement completes normally."
> "If all these statements complete normally, or if there are no statements after the pattern label containing the matching pattern, then the entire switch statement completes normally."
> "If all these statements complete normally, or if there are no statements after the default pattern label, then the entire switch statement completes normally."
>
> The first two are identical, the last only slightly different, perhaps something can be done to consolidate.

I'll take another look.

> * "A break statement either transfers control out of an enclosing statement or returns a value to an immediately enclosing switch expression."
>
> Is it an either/or? My mental model is that break always transfers control out - it can do so with a value, or w/o a value (as in a classic break).

This is a good question, although probably only one for spec-nerds. The problem is that the concept of "transfer of control" is only valid for statements - in essence you jump from one statement to the other. There is no concept in the JLS of control for *expressions*. So you can't really say that the break statement with a value transfers control to an *expression*. This is what is so "unusual" about switch expressions, they are expressions with statements inside... This either/or distinction makes clear, for better or for worse, the new dual nature of break statements: they either transfer control to another statement, or they end up returning a value to an enclosing expression.

> * I like the fact that you define the semantics of the expression switch clauses in terms of desugaring to statements blocks - this is consistent with what we do in other areas (enhanced for loop, try with resources).

Thanks! Although, with the proposed change to forbid fall through from statement groups into clauses, I'm not sure they can stay.

> * I suggest putting the paragraph in 15.29 starting with:
>
> "Given a switch expression, all of the following must be true"
>
> Ahead of the desugaring paragraph, which seems more execution/semantics-related, while this one is still about well-formedness.

Yes! Thanks.

> * On totality - this line:
>
> default -> 10; // Legal
>
> deserves some more explanation - e.g. one might think it's unreachable, but it's not because new constants could pop up at runtime; maybe add a clarification.

Yes! Thanks.

> * On non-returning, this sentence is obscure:
>
> "Thus a switch expression block that can not complete normally, can only do so by occurrences of a break statement with an Expression. This ensures that a switch expression must either result in a value, or complete abruptly."
>
> because it contradicts what is said just a line above:
>
> "an occurrence of a break statement with an Expression in a switch expression means that the switch expression will complete normally with the value of the Expression"
>
> The way I read this is:
>
> 1) the only way for the block after a 'case' in a switch pattern to complete abnormally is via a break expression
> 2) even if the _block_ completes abnormally, the containing switch expression will complete normally, with the value of Expression
>
> Is that what you meant?

Yes, although I'm not sure I quite see the "contradiction". I'll take another look at this text.
> * At the end of the switch expression section there are sub-optimal sentences like the ones that appear for switch statements (e.g. "pattern matches") - see above.

Okay, thanks.

> Cheers
>
> Maurizio
>
> On 12/04/18 22:27, Gavin Bierman wrote:
>> I have uploaded a draft spec for JEP 325: Switch expressions at http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>>
>> Note there are still three things missing:
>>
>> * There is no text about typing a switch expression, as this is still being discussed on this list.
>> * There is no name given for the exception raised at runtime when a switch expression fails to find a matching pattern label, as this is still being discussed on this list.
>> * The spec currently permits fall through from a "case pattern:" statement group into a "case pattern ->" clause. We are still working through the consequences of removing this possibility.
>>
>> Comments welcomed!
>> Gavin

From brian.goetz at oracle.com Wed Apr 18 17:58:31 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 18 Apr 2018 13:58:31 -0400
Subject: [records] Ancillary fields
In-Reply-To:
References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com>
Message-ID:

Seeing no dissent on the claim that the essential use case for ancillary fields is caching derived properties, let me talk about how I would like to handle this: lazy (final) fields.

For background, this is something we've been exploring for a long time (see for example http://cr.openjdk.java.net/~jrose/draft/lazy-final.html), but this is also something that we can do in the context of the language if we're willing to relax the requirements a bit.

The basic idea is that we can describe fields as `lazy` (either static or instance fields), with an initializer, which are implicitly `final`, and have the compiler rewrite reads of those fields to do a lazy initialization instead.
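
As a rough illustration of the rewrite described above, here is a minimal hand-written sketch of a lazily initialized instance field in today's Java. The `lazy` modifier is only a proposal in this thread and does not compile, so everything here is a stand-in: the class `LazyHashPoint`, its fields, and `computeHashCode()` are hypothetical names chosen for the example. The caching pattern is the racy single-check idiom familiar from `String.hashCode()`; it is acceptable here only because the cached value is an `int` and recomputing it is idempotent. A general lazy field would presumably need stronger synchronization.

```java
// Hand-written approximation of a lazily initialized, effectively final
// instance field. All names are hypothetical; `lazy` itself is not Java.
final class LazyHashPoint {
    private final int x;
    private final int y;

    // Stands in for a proposed `lazy int cachedHash = computeHashCode();`.
    // 0 doubles as the "not yet computed" marker, so a genuine hash of 0
    // is simply recomputed on every call.
    private int cachedHash;

    LazyHashPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Derived deterministically from the immutable state (x, y).
    private int computeHashCode() {
        return 31 * x + y;
    }

    @Override
    public int hashCode() {
        int h = cachedHash;        // read the field once into a local
        if (h == 0) {
            h = computeHashCode(); // duplicate work under a race is harmless
            cachedHash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof LazyHashPoint)) {
            return false;
        }
        LazyHashPoint other = (LazyHashPoint) o;
        return x == other.x && y == other.y; // the cache is not part of equality
    }

    public static void main(String[] args) {
        LazyHashPoint p = new LazyHashPoint(3, 4);
        System.out.println(p.hashCode()); // computed on first call: 31 * 3 + 4 = 97
        System.out.println(p.hashCode()); // served from the cached field: 97
    }
}
```

Note that the cached field takes no part in `equals`, which is the property that keeps derived state from becoming new record state.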
For static fields, we can use ConstantDynamic and get lazy initialization for free; for instance fields, we have to do a little more work (CASes, fences), but the game is the same.

This is useful well beyond records. For example, classes like `String` cache a lazily computed hash code; these classes could just do

    private lazy int cacheHash = computeHashCode();

    public int hashCode() { return cacheHash; }

It's also useful for frequently used static fields:

    private static lazy Logger logger = Logger.of("com.foo.bar");

Much lazy initialization code is error-prone, so this would eliminate those errors; it's also tempting to avoid lazy initialization where it might be marginally useful. (Static initializers are also one of the big pain points in AOT; this eliminates many static initializers.)

What does this have to do with records? Well, if the goal is to cache lazily computed values derived from the state, then lazy fields would give us that without opening up to the full generality of ancillary fields. We'd then say that records can only have additional _lazy_ instance fields.

(Sometimes lazy fields are cast in the opposite direction -- cached methods rather than lazy fields. There is an obvious set of tradeoffs for how to structure it, but neither is strictly more powerful than the other.)

From kevinb at google.com Wed Apr 18 18:16:30 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 18 Apr 2018 11:16:30 -0700
Subject: JEP325: Switch expressions spec
In-Reply-To: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com>
Message-ID:

> If one of the patterns is a constant expression or enum constant that is equal to the value of the selector expression, then we say that the pattern *matches*.

I think "equal" is ambiguous for strings (and will be for doubles when they happen).
switch (s) { > // Even default does not match > // Will throw an exception > default: > System.out.println("It's a string"); > } I think it would be good to show the modified example that uses `case null: default:` together in order to produce the expected default behavior. > *A pattern label can contain multiple patterns, and is said to match if > any one of these patterns matches. The pattern label can then be seen to be > a disjunction of its constituent patterns.*switch (day) { > case SATURDAY, SUNDAY: > // matches if it is a Saturday OR a Sunday > System.out.println("It's the weekend!"); > } Were we considering allowing `case *something*, default:` or `default, case *something*:`? Of course you would never ever actually *need* this... except in the one case that *something* is null. In a switch expression it would be sad to be forced to revert to the old syntax for only this reason. If we're not allowing that, perhaps that's worth pointing out. Example 14.11-1. Fall-Through in the switch Statement Since there's a whole section on this, it might be helpful to point out that when multiple labels are used with no intervening code (not using the new comma feature), this is* not* considered fall-through. Everyone gets confused about the terminology. Meh... > Evaluation of an expression can produce side effects, because expressions > may contain embedded assignments, increment operators, decrement operators, > and method invocations. > *In addition, lambda expressions and switch expressions have bodies that > may contain arbitrary statements.* A lambda "contains" statements *physically*, but nothing gets executed. If anything, it is anonymous *classes* that belong here (though maybe, arguably, that would be covered if "method invocations" was changed to "method or constructor invocations"?). Suggestion: "... 
because expressions may contain embedded assignments, increment
operators, decrement operators, and method or constructor invocations,
as well as arbitrary statements nested inside a switch expression."
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kevinb at google.com Wed Apr 18 18:46:47 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 18 Apr 2018 11:46:47 -0700
Subject: [records] Ancillary fields
In-Reply-To:
References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com>
Message-ID:

Lazy initialization is a massive pain to get right, so I'm very
intrigued by this proposal.

On Wed, Apr 18, 2018 at 10:58 AM, Brian Goetz wrote:

> This is useful well beyond records. For example, classes like `String`
> cache a lazily computed hash code; these classes could just do
>
>     private int cacheHash = computeHashCode();
>
>     public int hashCode() { return cacheHash; }
>

('course, String itself may not use this, as it prefers to save memory
by just letting rare values like "drumwood boulderhead" be uncacheable.)

Ahh, you missed the `lazy` keyword on there :-) Which is good because it
raises an issue: when you forget it, bad performance may result without
other observable consequence. Although, it's already the case that
reading code like the above ought to raise all kinds of alarm bells
(e.g., now I want to go check which fields computeHashCode() might be
referring to, and where *they're* initialized), so I *should* be looking
for that `lazy` keyword to put my mind at ease. So maybe this is okay.

I assume that, unlike other field initializers, I'm safe to refer to
*any* other field regardless of how and where that field is initialized.
Right?

The intersection with primitives is interesting. I assume it gets
secretly created as an Integer? So there's a little extra hidden memory
consumption.

For a reference type, what happens if the initialization produces `null`?
(I suggest throwing NPE, because I think the alternatives are worse?) I pondered also allowing a method to be marked lazy (memoized, really) and let the field(s) be created behind the scenes to store its result, but the risk of that being applied to an impure method is probably too scary. On 4/13/2018 1:15 PM, Kevin Bourrillion wrote: > > As one of the voices demanding we allow ancillary fields, I can confirm > that I had only these derived-state use cases in mind. I don't see anything > else as legitimate. That is, I think that the semantic invariants you're > trying to preserve for records are worth fighting for, and additional > *non-derived* state would violate them. > > On Fri, Apr 13, 2018 at 9:46 AM, Brian Goetz > wrote: > >> Let's see if we can make some progress on the elephant in the room -- >> ancillary fields. Several have expressed the concern that without the >> ability to declare some additional instance state, the feature will be too >> limited. >> >> The argument in favor of additional fields is the obvious one; more >> classes can be records. And there are some arguably valid use cases for >> additional fields that don't conflict with the design center for records. >> The best example is derived state: >> >> - When a field is a cached property derived from the record state (such >> as how String caches its hashCode) >> >> Arguably, if a field is derived deterministically from immutable record >> state, then it is not creating any new record state. This surely seems >> within the circle. >> >> The argument against is more of a slippery-slope one; I believe >> developers would like to view this feature through the lens of syntactic >> boilerplate, rather than through semantics. 
If we let them, they would >> surely and routinely do the following: >> >> record A(int a, int b) { >> private int c; >> >> public A(int a, int b, int c) { >> this(a, b); >> this.c = c; >> } >> >> public boolean equals(Object other) { >> return default.equals(other) && ((A) other).c == c; >> } >> } >> >> Here, `c` is surely part of the state of `A`. And, they wouldn't even >> know what they'd lost; they would just assume records are a way of >> "kickstarting" a class declaration with some public fields, and then you >> can mix in whatever private state you want. >> >> Why is this bad? While "reduced-boilerplate classes" is a valid feature >> idea, our design goal for records is much more than that. The semantic >> constraints on records are valuable because they yield useful invariants; >> that they are "just" their state vector, that they can be freely taken >> apart and put back together with no loss of information, and hence can be >> freely serialized/marshaled to JSON and back, etc. >> >> We currently prohibit records like `A` via a number of restrictions: no >> additional fields, no override of equals. We don't need all of these >> restrictions to achieve the desired goal, but we also can't relax them all >> without opening the gate. So we should decide carefully which we want to >> relax, as making the wrong choice constrains us in the future. >> >> Before I dive into details of how we might extend records to support the >> case of "cached derived state", I'd like to first come to some agreement >> that this covers the use cases that we think fall into the "legitimate" >> uses of additional fields. >> >> >> >> On 3/16/2018 2:55 PM, Brian Goetz wrote: >> >>> There are a number of potentially open details on the design for >>> records. My inclination is to start with the simplest thing that preserves >>> the flexibility and expectations we want, and consider opening up later as >>> necessary. 
>>> >>> One of the biggest issues, which Kevin raised as a must-address issue, >>> is having sufficient support for precondition validation. Without >>> foreclosing on the ability to do more later with declarative guards, I >>> think the recent construction proposal meets the requirement for >>> lightweight enforcement with minimal or no duplication. I'm hopeful that >>> this bit is "there". >>> >>> Our goal all along has been to define records as being ?just macros? for >>> a finer-grained set of features. Some of these are motivated by >>> boilerplate; some are motivated by semantics (coupling semantics of API >>> elements to state.) In general, records will get there first, and then >>> ordinary classes will get the more general feature, but the default answer >>> for "can you relax records, so I can use it in this case that almost but >>> doesn't quite fit" should be "no, but there will probably be a feature >>> coming that makes that class simpler, wait for that." >>> >>> >>> Some other open issues (please see my writeup at >>> http://cr.openjdk.java.net/~briangoetz/amber/datum.html for reference), >>> and my current thoughts on these, are outlined below. Comments welcome! >>> >>> - Extension. The proposal outlines a notion of abstract record, which >>> provides a "width subtyped" hierarchy. Some have questioned whether this >>> carries its weight, especially given how Scala doesn't support case-to-case >>> extension (some see this as a bug, others as an existence proof.) Records >>> can implement interfaces. >>> >>> - Concrete records are final. Relaxing this adds complexity to the >>> equality story; I'm not seeing good reasons to do so. >>> >>> - Additional constructors. I don't see any reason why additional >>> constructors are problematic, especially if they are constrained to >>> delegate to the default constructor (which in turn is made far simpler if >>> there can be statements ahead of the this() call.) 
Users may find the lack >>> of additional constructors to be an arbitrary limitation (and they'd >>> probably be right.) >>> >>> - Static fields. Static fields seem harmless. >>> >>> - Additional instance fields. These are a much bigger concern. While >>> the primary arguments against them are of the "slippery slope" variety, I >>> still have deep misgivings about supporting unrestricted non-principal >>> instance fields, and I also haven't found a reasonable set of restrictions >>> that makes this less risky. I'd like to keep looking for a better story >>> here, before just caving on this, as I worry doing so will end up biting us >>> in the back. >>> >>> - Mutability and accessibility. I'd like to propose an odd choice >>> here, which is: fields are final and package (protected for abstract >>> records) by default, but finality can be explicitly opted out of >>> (non-final) and accessibility can be explicitly widened (public). >>> >>> - Accessors. Perhaps the most controversial aspect is that records are >>> inherently transparent to read; if something wants to truly encapsulate >>> state, it's not a record. Records will eventually have pattern >>> deconstructors, which will expose their state, so we should go out of the >>> gate with the equivalent. The obvious choice is to expose read accessors >>> automatically. (These will not be named getXxx; we are not burning the >>> ill-advised Javabean naming conventions into the language, no matter how >>> much people think it already is.) The obvious naming choice for these >>> accessors is fieldName(). No provision for write accessors; that's >>> bring-your-own. >>> >>> - Core methods. Records will get equals, hashCode, and toString. 
>>> There's a good argument for making equals/hashCode final (so they can't be >>> explicitly redeclared); this gives us stronger preservation of the data >>> invariants that allow us to safely and mechanically snapshot / serialize / >>> marshal (we'd definitely want this if we ever allowed additional instance >>> fields.) No reason to suppress override of toString, though. Records could >>> be safely made cloneable() with automatic support too (like arrays), but >>> not clear if this is worth it (its darn useful for arrays, though.) I >>> think the auto-generated getters should be final too; this leaves arrays as >>> second-class components, but I am not sure that bothers me. >>> >>> >>> >>> >>> >>> >> > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.buckley at oracle.com Wed Apr 18 19:02:58 2018 From: alex.buckley at oracle.com (Alex Buckley) Date: Wed, 18 Apr 2018 12:02:58 -0700 Subject: JEP325: Switch expressions spec In-Reply-To: References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> Message-ID: <5AD79662.60605@oracle.com> On 4/18/2018 11:16 AM, Kevin Bourrillion wrote: > Evaluation of an expression can produce side effects, because > expressions may contain embedded assignments, increment operators, > decrement operators, and method invocations. *In addition, lambda > expressions and switch expressions have bodies that may contain > arbitrary statements. > > A lambda "contains" statements /physically/, but nothing gets > executed. If anything, it is anonymous /classes/ that belong here > (though maybe, arguably, that would be covered if "method invocations" > was changed to "method or constructor invocations"?). 
The goal was to highlight that a lambda/switch expression is not like
(say) a field access expression, because of the ability to have a body
of statements rather than merely a tree of subexpressions ... but you're
right, "Evaluation of a lambda expression is distinct from execution of
the lambda body." (JLS 15.27.4)

> Suggestion: "... because expressions may contain embedded assignments,
> increment operators, decrement operators, and method or constructor
> invocations, as well as arbitrary statements nested inside a switch
> expression."

Yes, limiting the arbitrariness to switch expressions (the sole "home"
for something-resembling-block-expressions) is right.

Alex

From brian.goetz at oracle.com Wed Apr 18 19:30:23 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 18 Apr 2018 15:30:23 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To:
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com>
Message-ID: <129fb52c-d6b9-cf73-b664-1b888b1b8f56@oracle.com>

All good points. Minor comments inline.

> Were we considering allowing `case /something/, default:` or
> `default, case /something/:`? Of course you would never ever actually
> /need/ this... except in the one case that /something/ is null. In a
> switch expression it would be sad to be forced to revert to the old
> syntax for only this reason.

This may well be needed, especially if we prohibit fallthrough from a
colon label into an arrow label. Another case where a similar problem
arises is this:

    case null:
    case String s:
        // whoops, s is not DA here

Really, we'd like to say

    case null s, String s:

or

    case (null | String) s:

or something similar. We don't have to cross this until we get to type
patterns, but it's on the horizon.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Wed Apr 18 20:39:12 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 18 Apr 2018 16:39:12 -0400
Subject: [records] Ancillary fields
In-Reply-To:
References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com>
Message-ID: <74880bca-fc77-0d84-3827-025b241c6a6e@oracle.com>

> Ahh, you missed the `lazy` keyword on there :-) Which is good because
> it raises an issue: when you forget it, bad performance may result
> without other observable consequence. Although, it's already the case
> that reading code like the above ought to raise all kinds of alarm
> bells (e.g., now I want to go check which fields computeHashCode()
> might be referring to, and where /they're/ initialized), so I
> /should/ be looking for that `lazy` keyword to put my mind at ease. So
> maybe this is okay.

Well, "bad" is relative; it won't be any worse than what you do today
with eager static fields. But yes, I did drop the lazy there.

> I assume that, unlike other field initializers, I'm safe to refer
> to /any/ other field regardless of how and where that field is
> initialized. Right?

I think you mostly are asking about instance fields. It would be safe to
refer to any other field; however, if you _read_ a lazy field in the
constructor, it might trigger computation of the field based on a
partially initialized object. The compiler could warn on the obvious
cases where this happens, but of course it can be buried in a chain of
method calls.

> The intersection with primitives is interesting. I assume it gets
> secretly created as an Integer? So there's a little extra hidden
> memory consumption.

For static fields, there's an obvious and good answer that is optimally
time and space efficient with no anomalies: condy. We desugar

    lazy static T t = e
    ...
    moo(t)

into

    // no field needed
    static t$init() { return e; }
    ...
    moo( ldc condy[ ... ] )

and let the constant pool do the lazy initialization and caching. JITs
love this.
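For readers who want to approximate `lazy static` in Java source as it
exists today, here is a minimal sketch. It uses the
initialization-on-demand holder idiom rather than condy, so it is only an
analogy to the desugaring above, not the proposed translation. The names
`LazyStaticSketch`, `Holder`, `expensive` and `initializerRuns` are
invented for illustration.

```java
// Sketch only: approximates `lazy static String t = expensive()` with the
// holder-class idiom. The proposal above would use condy instead.
final class LazyStaticSketch {
    // Counts initializer runs so the laziness is observable (illustration only).
    static int initializerRuns = 0;

    // The JVM initializes Holder on the first access to Holder.T, not earlier,
    // and class initialization runs at most once and is thread-safe.
    private static final class Holder {
        static final String T = expensive();
    }

    // Stands in for the initializer expression `e` (hypothetical).
    private static String expensive() {
        initializerRuns++;
        return "computed";
    }

    // Every read of the lazy field `t` becomes a call to this accessor.
    static String t() {
        return Holder.T;
    }

    public static void main(String[] args) {
        System.out.println(t());
        System.out.println(t());
        System.out.println("initializer runs: " + initializerRuns);
    }
}
```

The holder idiom costs one extra class per lazy field, which is part of
why a constant-pool-based translation is attractive.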
For instance fields, we have a choice; use extra space in the object to store the "already initialized" bit, or satisfy ourselves with the trick that String does with hashCode() -- allow redundant recomputation in the case where the initializer serves up the default value. So I think the divide is not ref-vs-primitive but whether we are willing to take the recomputation hit when it serves up a default value. -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed Apr 18 21:45:12 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 18 Apr 2018 21:45:12 +0000 Subject: [records] Ancillary fields In-Reply-To: <74880bca-fc77-0d84-3827-025b241c6a6e@oracle.com> References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com> <74880bca-fc77-0d84-3827-025b241c6a6e@oracle.com> Message-ID: <02A0AF7A-218C-4FC6-9EB9-4570588533DE@univ-mlv.fr> On April 18, 2018 8:39:12 PM UTC, Brian Goetz wrote: > > >> Ahh, you missed the `lazy` keyword on there :-) Which is good because > >> it raises an issue: when you forget it, bad performance may result >> without other observable consequence. Although, it's already the case > >> that reading code like the above ought to raise all kinds of alarm >> bells (e.g., now I want to go check which fields computeHashCode() >> might be referring to, and where /they're/?initialized), so I >> /should/?be looking for that `lazy` keyword to put my mind at ease. >So >> maybe this is okay. > >Well, "bad" is relative; it won't be any worse than what you do today >with eager static fields.? But yes, I did drop the lazy there. > >> I assume that, unlike other field initializers, I'm safe to refer >> to/any/?other field regardless of how and where that field is >> initialized. Right? > >I think you mostly are asking about instance fields.? 
It would be safe >to refer to any other field, however, if you _read_ a lazy field in the > >constructor, it might trigger computation of the field based on a >partially initialized object.? The compiler could warn on the obvious >cases where this happens, but of course it can be buried in a chain of >method calls. > >> The intersection with primitives is interesting. I assume it gets >> secretly created as an Integer? So there's a little extra hidden >> memory consumption. > >For static fields, there's an obvious and good answer that is optimally > >time and space efficient with no anomalies: condy.? We desugar > > ??? lazy static T t = e > ??? ... > ??? moo(t) > >into > > ??? // no field needed > ??? static t$init() { return ; } > ??? ... > ??? moo( ldc condy[ ... ] ) > >and let the constant pool do the lazy initialization and caching. JITs >love this. > >For instance fields, we have a choice; use extra space in the object to > >store the "already initialized" bit, or satisfy ourselves with the >trick >that String does with hashCode() -- allow redundant recomputation in >the >case where the initializer serves up the default value. > >So I think the divide is not ref-vs-primitive but whether we are >willing >to take the recomputation hit when it serves up a default value. I fully agree. The lazy static with condy also has the same semantics, if the bsm do a side effect you may see that the bsm can be called multiple times. For the record, I've just presented the lazy static this afternoon at devoxx fr (in order to explain the semantics of condy) and several people reach me afterward saying it was in interesting idea. Remi -- Sent from my Android device with K-9 Mail. Please excuse my brevity. 
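
For reference, the "trick that String does with hashCode()" discussed in
this thread can be sketched in plain Java. This is an illustrative class,
not the proposed `lazy` translation; `Name`, `cachedHash` and the
`computations` counter are invented for the example. The default value 0
doubles as the "not yet computed" marker, so a value that legitimately
hashes to 0 is recomputed on every call, which is the recomputation hit
being weighed against an extra "already initialized" bit.

```java
// Sketch only: lazy caching of derived state without an "initialized" flag.
final class Name {
    private final String text;
    private int hash;       // 0 means "not computed yet" OR "computed and equal to 0"
    int computations = 0;   // instrumentation for the example only

    Name(String text) {
        this.text = text;
    }

    int cachedHash() {
        int h = hash;                // read the field once
        if (h == 0) {
            computations++;
            h = text.hashCode();     // deterministic function of immutable state
            hash = h;                // racing threads would store the same value
        }
        return h;
    }

    public static void main(String[] args) {
        Name cached = new Name("abc");
        cached.cachedHash();
        cached.cachedHash();
        System.out.println("abc computed " + cached.computations + " time(s)");

        Name uncacheable = new Name("");   // "".hashCode() is 0
        uncacheable.cachedHash();
        uncacheable.cachedHash();
        System.out.println("empty computed " + uncacheable.computations + " time(s)");
    }
}
```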
From kevinb at google.com Wed Apr 18 21:59:02 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 18 Apr 2018 14:59:02 -0700 Subject: [records] Ancillary fields In-Reply-To: <74880bca-fc77-0d84-3827-025b241c6a6e@oracle.com> References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com> <74880bca-fc77-0d84-3827-025b241c6a6e@oracle.com> Message-ID: On Wed, Apr 18, 2018 at 1:39 PM, Brian Goetz wrote: > Ahh, you missed the `lazy` keyword on there :-) Which is good because it > raises an issue: when you forget it, bad performance may result without > other observable consequence. Although, it's already the case that reading > code like the above ought to raise all kinds of alarm bells (e.g., now I > want to go check which fields computeHashCode() might be referring to, and > where *they're* initialized), so I *should* be looking for that `lazy` > keyword to put my mind at ease. So maybe this is okay. > > > Well, "bad" is relative; it won't be any worse than what you do today with > eager static fields. > Yes, it's just that lazy and eager code aren't as trivially distinguishable anymore, so... I thought I should mention it, but it's no kind of dealbreaker. > For instance fields, we have a choice; use extra space in the object to > store the "already initialized" bit, or satisfy ourselves with the trick > that String does with hashCode() -- allow redundant recomputation in the > case where the initializer serves up the default value. > I strongly suspect there isn't going to be any generally safe way to do the latter. So I think the divide is not ref-vs-primitive but whether we are willing to > take the recomputation hit when it serves up a default value. > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From brian.goetz at oracle.com Wed Apr 18 22:19:25 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 18 Apr 2018 18:19:25 -0400
Subject: [records] Ancillary fields
In-Reply-To:
References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com> <74880bca-fc77-0d84-3827-025b241c6a6e@oracle.com>
Message-ID: <1ce6065a-3619-b32c-e3af-0f51e130bc77@oracle.com>

For primitives, you can always force yourself to use Integer:

    lazy Integer i = f();

and make sure f() never returns null. You can do something similar with
a library class (e.g., Optional) for references. So there are surely
_safe_ ways to do it, albeit ugly ones.

I kind of prefer to have boxing like this be explicit rather than
implicit; if the user thinks they're putting an `int` in their class,
I'd like to be as transparent about that as we can.

You were willing to throw on null in the reference case; that can also
be simulated by:

    lazy Foo f = requireNonNull(f());

Which isn't even that ugly or expensive. So I suspect that this is less
of a problem than one might first think, but I could be wrong.

On 4/18/2018 5:59 PM, Kevin Bourrillion wrote:
>
>     For instance fields, we have a choice; use extra space in the
>     object to store the "already initialized" bit, or satisfy
>     ourselves with the trick that String does with hashCode() -- allow
>     redundant recomputation in the case where the initializer serves
>     up the default value.
>
>
> I strongly suspect there isn't going to be any generally safe way to
> do the latter.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Thu Apr 19 20:44:45 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 19 Apr 2018 16:44:45 -0400
Subject: [switch] Further unification on switch
Message-ID: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>

We've been reviewing the work to date on switch expressions.
Here's where we are, and here's a possible place we might move to, which
I like a lot better than where we are now.

## Goals

As a reminder, the primary goal here is _not_ switch expressions; switch
expressions are supposed to just be an uncontroversial waypoint on the
way to the real goal, which is a more expressive and flexible switch
construct that works in a wider variety of situations, including
supporting patterns, being less hostile to null, use as either an
expression or a statement, etc.

And the reason we think that improving switch is the right primary goal
is because a "do one of these based on ..." construct is _better_ than
the corresponding chain of if-else-if, for multiple reasons:

 - Possibility for the compiler to do exhaustiveness analysis,
   potentially finding more bugs;
 - Possibility for more efficient dispatch -- a switch could be O(1),
   whereas an if-else chain is almost certainly O(n);
 - More semantically transparent -- it's obvious the user is saying
   "do one of these, based on ...";
 - Eliminates the need to repeat (and possibly get wrong) the switch
   target.

Switch does come with a lot of baggage (fallthrough by default,
questionable scoping, need to explicitly break), and this baggage has
produced the predictable distractions in the discussion -- a desire that
we subordinate the primary goal (making switch more expressive) to the
more contingent goal of "fixing" the legacy problems of switch.

These legacy problems of switch may be unfortunate, but to whatever
degree we end up ameliorating these, this has to be purely a
side-benefit -- it's not the primary goal, no matter how annoying people
find them. (The desire to "fix" the mistakes of the past is frequently a
siren song, which is why we don't allow ourselves to take these as
first-class requirements.)
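
The bullets in the Goals section can be made concrete with a small
sketch. It is written against switch expressions as they eventually
shipped (Java 14 and later), which postdates this message, and the enum
and method names are invented for illustration. The if-else version
repeats the target in every test and gives the compiler nothing to check;
the switch version names the target once, and because it has no `default`
the compiler rejects it if an enum constant is left uncovered.

```java
// Sketch only: the same dispatch written as an if-else chain and as a switch.
final class DispatchSketch {
    enum Day { SATURDAY, SUNDAY, MONDAY }

    // if-else chain: `d` is repeated in each test, tests run in order, and a
    // forgotten constant is only discovered at run time.
    static String viaIfElse(Day d) {
        if (d == Day.SATURDAY) {
            return "weekend";
        } else if (d == Day.SUNDAY) {
            return "weekend";
        } else if (d == Day.MONDAY) {
            return "weekday";
        }
        throw new IllegalArgumentException("unhandled: " + d);
    }

    // switch expression: the target appears once; with no default clause,
    // deleting a case turns into a compile-time error rather than a bug.
    static String viaSwitch(Day d) {
        return switch (d) {
            case SATURDAY, SUNDAY -> "weekend";
            case MONDAY -> "weekday";
        };
    }

    public static void main(String[] args) {
        for (Day d : Day.values()) {
            System.out.println(d + ": " + viaIfElse(d) + " / " + viaSwitch(d));
        }
    }
}
```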
#### What we're not going to do

The worst possible outcome (which is also the most commonly suggested
"solution" in forums like reddit) would be to invent a new construct
that is similar to, but not quite the same as switch (`snitch`), without
being a 100% replacement for today's quirky switch. Today's switch is
surely suboptimal, but it's not so fatally flawed that it needs to be
euthanized, and we don't want to create an "undead" language construct
forever, which everyone will still have to learn, and keep track of the
differences between `switch` and `snitch`. No thank you.

That means we extend the existing switch statement, and increase
flexibility by supporting an expression form, and to the degree needed,
embrace its quirks. ("No statement left behind.")

#### Where we started

In the first five minutes of working on this project, we sketched out
the following (call it the "napkin sketch"), where an expression switch
has case arms of the form:

    case L -> e;

or

    case L -> { statement*; break e; }

This was enough to get started, but of course the devil is in the
details.

#### Where we are right now

We moved away from the napkin sketch for a few reasons, in part because
it seemed to be drawing us down the road towards switch and snitch --
which was further worrying as we still had yet to deal with the
potential that pattern switch and constant switch might have differences
as well. We want a unified model of switch that deals well enough with
all the cases -- expressions and statements, patterns and constants.

Our current model (call this Unification Attempt #1, or UA1 for short)
is a step towards a unified model of switch, and this is a huge step
forward. In this model, there's one switch construct, and there's one
set of control flow rules, including for break (like return, break takes
a value in a value context and is void in a void context).

For convenience and safety, we then layered a shorthand atop
value-bearing switches, which is to interpret
    case L -> e;

as

    case L: break e;

expecting the shorter form would be used almost all the time. (This has
a pleasing symmetry with the expression form of lambdas, and (at least
for expression switches) alleviates two of the legacy pain points.
Switch expressions have other things in common with lambdas too; they
are the only ones that can have statements; they are the only ones that
interact with nonlocal control flow.)

This approach offers a lot of flexibility (some would say too much). You
can write "remi-style" expression switches:

    int x = switch (y) {
        case 1: break 2;
        case 2: break 4;
        default: break 8;
    };

or you can write "new-style" expression switches:

    int x = switch (y) {
        case 1 -> 2;
        case 2 -> 4;
        default -> 8;
    };

Some people like the transparency of the first; others like the
compactness and fallthrough-safety of the second. And in cases where you
mostly want the benefits of the second, but the real world conspires to
make one or two cases difficult, you can mix them, and take full
advantage of what "old switch" does -- with no new rules for control
flow.

#### Complaints

There was the usual array of complaints over syntax -- many of which can
be put down to "bleah, new is different, different is bad", but the most
prominent one seems to be a generalized concern that other users (never
us, of course, but we always fear for what others might do) won't be
able to "handle" the power of mixed switches and will write terrible
code, and then the world will burn. (And, because the mixing comes with
fallthrough, it further engenders the "you idiots, you fixed the wrong
thing" reactions.)
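
For anyone trying the "remi-style" and "new-style" expression switches
shown above on a current JDK, a runnable sketch follows. One assumption
to flag loudly: it uses the syntax that eventually shipped in Java 14,
where the value-bearing `break e;` of this proposal is spelled
`yield e;`. The class and method names are invented for illustration.

```java
// Sketch only: the two expression-switch styles in shipped Java syntax.
final class ExpressionSwitchStyles {
    // Colon form ("remi-style"): each arm hands back its value explicitly.
    // `yield 2;` plays the role of this message's `break 2;`.
    static int colonStyle(int y) {
        return switch (y) {
            case 1: yield 2;
            case 2: yield 4;
            default: yield 8;
        };
    }

    // Arrow form ("new-style"): `case L -> e;` is shorthand for yielding e.
    static int arrowStyle(int y) {
        return switch (y) {
            case 1 -> 2;
            case 2 -> 4;
            default -> 8;
        };
    }

    public static void main(String[] args) {
        for (int y = 1; y <= 3; y++) {
            System.out.println(y + " -> " + colonStyle(y) + " / " + arrowStyle(y));
        }
    }
}
```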
Personally, I think the fear of mixing is deeply overblown -- I think in
most cases people will gravitate towards one of the two clean styles,
and only mix where the complexity of the real world forces them to, but
there's value in understanding the underpinnings of such reactions, even
if in the end they'd turn out to be much hot air about nothing.

#### A real issue with mixing!

But, there is a real problem with our approach, which is: while a
unified switch is the right goal, UA1 is not unified _enough_.
Specifically, we haven't fully aligned the statement forms, and this
conspires to reduce expressiveness and safety. That is, in an expression
switch you can say:

    case L -> e;

but in a statement switch you can't say

    case L -> s;

The reason for this is a purely accidental one: if we allowed this, then
we _would_ likely find ourselves in the mixing hell that people are
afraid of, which in turn would make the risk of accidental fallthrough
_even worse_ than it is today. So the failing of mixing is not that it
will be abused, but that it constrains us from actually getting to a
unified construct.

## Closing the gap

So, let's take one more step towards unifying the two forms (call this
UA2), rather than a step away from it. Let's say that _all_ switches can
support either old-style (colon) or new-style (arrow) case labels -- but
must stick to one kind of case label in a given switch:

    // statement switch
    switch (x) {
        case 1: println("one"); break;
        case 2: println("two"); break;
    }

or

    // also statement switch
    switch (x) {
        case 1 -> println("one");
        case 2 -> println("two");
    }

If a switch is a statement, the RHS is a statement, which can be a block
statement:
    case L -> { a; b; }

We get there by first taking a step backwards, at least in terms of superficial syntax, to the syntax suggested by the napkin sketch, where if a switch is an expression, the RHS of an -> case is an expression or a block statement (in the latter case, it must complete abruptly by reason of either break-value or throw). Just as we expected "break value" to be rare in expression switches under UA1 since developers will generally prefer the shorthand form where applicable, we expect it to be equally rare under UA2.

Then, as in UA1, we render unto expressions the things that belong to expressions; they must be total (an expression must yield a value or complete abruptly by reason of throwing.)

#### Look, accidental benefits!

Many of switch's failings (fallthrough, scoping) are not directly specified features, as much as emergent properties of the structure and control flow of switches. Since by definition you can't fall out of an arrow case, then an all-arrow switch gives the fallthrough-haters what they want "for free", with no need to treat it specially. In fact, it's even better; in the all-arrow form, all of the things people hate about switch -- the need to say break, the risk of fallthrough, and the questionable scoping -- all go away.

#### Scorecard

There is one switch construct, which can be used as either an expression or a statement; when used as an expression, it acquires the characteristics of expressions (must be total, no nonlocal control flow out.) Each can be expressed in one of two syntactic forms (arrow and colon.) All forms will support patterns, null handling, and multiple labels per case. The control flow and scoping rules are driven by structural properties of the chosen form.

The (statement, colon) case is the switch we have had since Java 1.0, enhanced as above (patterns, nulls, etc.)

The (statement, arrow) case can be considered a nice syntactic shorthand for the previous, which obviates the annoyance of "break", implicitly prevents fallthrough of all forms, and avoids the confusion of current switch scoping. Many existing statement switches that are not expressions in disguise can be refactored to this.

The (expression, colon) form is a subset of UA1, where you just never say "arrow".

The (expression, arrow) case can again be considered a nice shorthand for the previous, again a subset of UA1, where you just never say "colon", and as a result, again don't have to think about fallthrough.

Totality is a property of expression switches, regardless of form, because they are expressions, and expressions must be total.

Fallthrough is a property of the colon-structured switches; there are no changes here.

Nonlocal control flow _out_ of a switch (continue to an enclosing loop, break with label, return) is a property of statement switches.

So essentially, rather than dividing the semantics along expression/statement lines, and then attempting to opportunistically heap a bunch of irrelevant features like "no fallthrough" onto the expression side "because they're cool" even though they have nothing to do with expression-ness, we instead divide the world structurally: the colon form gives you the old control flow, and the arrow form gives you the new. And either can be used as a statement, or an expression. And no one will be confused by mixing.

Orthogonality FTW. No statement gets left behind.

## Explaining it

Relative to UA1, we could describe this as adding back the blocks (it's not really a block expression) from the napkin model, supporting an arrow form of statement switches with blocks too, and then restricting switches to all-arrow or all-colon. Then each quadrant is a restriction of this model. But that's not how we'd teach it.

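The four quadrants of the scorecard (statement or expression, crossed with colon or arrow) can be sketched side by side. This is only an illustration, and it assumes the syntax that later shipped as standard in Java 14, where `yield` plays the role of the value-bearing `break e` discussed in this thread. The class name `SwitchQuadrants` and its method names are made up for the example.

```java
public class SwitchQuadrants {

    // (statement, colon): legacy control flow, so each case needs an explicit break.
    static String statementColon(int x) {
        String result = "other";
        switch (x) {
            case 1: result = "one"; break;
            case 2: result = "two"; break;
        }
        return result;
    }

    // (statement, arrow): no fallthrough is possible and no break is needed.
    static String statementArrow(int x) {
        String result = "other";
        switch (x) {
            case 1 -> result = "one";
            case 2 -> result = "two";
        }
        return result;
    }

    // (expression, colon): must be total, hence the default; each arm yields a value.
    // `yield` here stands in for the thread's proposed `break e`.
    static String expressionColon(int x) {
        return switch (x) {
            case 1: yield "one";
            case 2: yield "two";
            default: yield "other";
        };
    }

    // (expression, arrow): the shorthand form, still required to be total.
    static String expressionArrow(int x) {
        return switch (x) {
            case 1 -> "one";
            case 2 -> "two";
            default -> "other";
        };
    }

    public static void main(String[] args) {
        for (int x : new int[] {1, 2, 9}) {
            System.out.println(statementColon(x) + " " + statementArrow(x) + " "
                    + expressionColon(x) + " " + expressionArrow(x));
        }
    }
}
```

All four methods compute the same mapping; only the control-flow rules differ, and those follow from the colon or arrow choice rather than from whether the switch is a statement or an expression.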
Relative to Java 10, we'd probably say:

 - Switch statements now come in a simpler (arrow) flavor, where there is no fallthrough, no weird scoping, and no need to say break most of the time. Many switches can be rewritten this way, and this form can even be taught first.
 - Switches can be used as either expressions or statements, with essentially identical syntax (some grammar differences, but this is mostly interesting only to spec writers). If a switch is an expression, it should contain expressions; if a switch is a statement, it should contain statements.
 - Expression switches have additional restrictions that are derived exclusively from their expression-ness: totality, can only complete abruptly if by reason of throw.
 - We allow a break-with-value statement in an expression switch as a means of explicitly providing the switch result; this can be combined with a statement block to allow for statements+break-expression.

The result is one switch construct, with modern and legacy flavors, which supports either expressions or statements. You can immediately look at the middle of a switch and tell (by arrow vs colon) whether it has the legacy control flow or not.

From guy.steele at oracle.com  Thu Apr 19 21:06:35 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 19 Apr 2018 17:06:35 -0400
Subject: [switch] Further unification on switch
In-Reply-To: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
Message-ID:

> On Apr 19, 2018, at 4:44 PM, Brian Goetz wrote:
>
> We've been reviewing the work to date on switch expressions. Here's where we are, and here's a possible place we might move to, which I like a lot better than where we are now.
> . . .
> ## Closing the gap
>
> So, let's take one more step towards unifying the two forms (call this UA2), rather than a step away from it.
> Let's say that _all_ switches can support either old-style (colon) or new-style (arrow) case labels -- but must stick to one kind of case label in a given switch . . .
>
> The result is one switch construct, with modern and legacy flavors, which supports either expressions or statements. You can immediately look at the middle of a switch and tell (by arrow vs colon) whether it has the legacy control flow or not.

I like it. I would like to think that an IDE could help you with changing between colon and arrow flavors.

--Guy

From dl at cs.oswego.edu  Thu Apr 19 21:31:42 2018
From: dl at cs.oswego.edu (Doug Lea)
Date: Thu, 19 Apr 2018 17:31:42 -0400
Subject: [switch] Further unification on switch
In-Reply-To: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
Message-ID: <806dfd48-b19f-919b-16bd-4edee8f680a7@cs.oswego.edu>

I was starting to get fatalistically pessimistic about switch, but the all-colon-as-statement vs all-arrow-as-expression idea (with nothing in-between) seems pretty good! And would be even better if JLS impact were carefully checked.

-Doug

On 04/19/2018 04:44 PM, Brian Goetz wrote:
> We've been reviewing the work to date on switch expressions. Here's where we are, and here's a possible place we might move to, which I like a lot better than where we are now.
> . . .

From kevinb at google.com  Thu Apr 19 21:43:30 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 19 Apr 2018 14:43:30 -0700
Subject: Expression switch exception naming
In-Reply-To:
References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com>
Message-ID:

Necromancing, since I noticed that the spec still contains a hole where this name would go.

*Name:*

 - I think something specific like UnexpectedEnumConstantE{rror,xception} would seem the right way to go. (Perhaps "Unrecognized"?)

*Hierarchy:*

 - It will want a common supertype it can share with the future "unexpected subtype of sealed type" error/exception.
 - As for where that supertype goes, I still maintain that this is *exactly* an IncompatibleClassChangeError (argument below), and thus should be a subtype of that. I also see nothing harmed by it being an Error instead of Exception.

My claim is that releasing an enum with a certain set of constants is qualitatively equivalent to releasing an interface with a certain set of abstract methods. We know that people key behavior off of enums (that's what enum switch is all about). That means that when we add a constant, we are adding new *contract*, which we (the enum owners) don't know how to fulfill. The call sites need to fulfill it.

Thought experiment: I can already implement an interface in two different ways: the normal way, or via a dynamic proxy that throws an exception if it gets an unexpected method. Let's imagine that the latter way was made exactly as easy to express as the former. I think everyone would probably agree that most implementations would *still* choose the current behavior. (Yes?) They don't *want* anything to fail at runtime that could instead fail at compile-time.

Anyway, all of this is just to support the notion that this should be an IncompatibleClassChangeError. Of course, the argument's been made in this thread that it *is* different from an incompatible class change. My response was that these reasons seem way too subtle to me. Or, have I been persistently missing something?

On Fri, Mar 30, 2018 at 11:31 AM, Kevin Bourrillion wrote:

> On Fri, Mar 30, 2018 at 10:48 AM, Brian Goetz wrote:
>
>> Backing way up, Alex had suggested that the right exception is (a subtype of) IncompatibleClassChangeEXCEPTION, rather than Error. I was concerned that ICC* would seem too low-level to users, though. But you're saying ICCE and subtypes are helpful to users, because they guide users to "blame your classpath". So in that case, is the ICC part a good enough trigger?
>
> (Just to be clear, Remi and I have been advocating for a subtype of ICC*Error* all along, in case anyone missed that.)
>
> All right, I've been focusing too much on the hierarchy, but the leaf-level name is more important than that (and the message text further still, and since I assume we'll do a fine job of that, I can probably relax a little). To answer your question, sure, the "ICC" is a pretty decent signal. Have we discussed Cyrill's point on -observers that we should create more specific exception types, such as UnrecognizedEnumConstantE{rror,xception}?
>
>> For an enum in the same class/package/module as the switch, the chance of getting the error at runtime is either zero (same class) or effectively zero (same package or module), because all sane developers build packages and modules in an atomic operation.
>>
>> For an enum in a different module as the switch, the chance of getting the error at runtime is nonzero, because we're linking against a JAR at runtime.
>>
>> So an alternative here is to tweak the language so that the "conclude exhaustiveness if all enum constants are present" behavior should be reserved for the cases where the switch and the enum are in the same module?
>>
>> (Just a thought.)
>
> Okay, that is a sane approach, but I think it leaves too much of the value on the floor. I often benefit from having my exhaustiveness validated and being able to find out at compile time if things change in the future.
>
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com  Thu Apr 19 21:50:19 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 19 Apr 2018 17:50:19 -0400
Subject: Expression switch exception naming
In-Reply-To:
References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com>
Message-ID:

I like Un{recognized,known}EnumConstantE{rror,xception}. When we get to sealed types, it will be the same but with something like s/EnumConstant/SealedTypeMember/.

I am still having trouble squaring the Error vs Exception, but you've pulled me from "seems like an Exception to me" into "crap, now I don't know" territory :)

I think what makes me uncomfortable is that there are some enums that are _intended_ to be extended, such as java.lang.annotation.ElementType. (In fact, we might be adding a new member soon; RECORD_COMPONENT.) And I would want clients of ElementType to be aware that they never know all the element types, and code accordingly.

Which suggests that enums need a mechanism to either mark them as sealed (which turns on the enhanced exhaustiveness behavior) or as non-sealed (which would turn it off).

On 4/19/2018 5:43 PM, Kevin Bourrillion wrote:
> Necromancing, since I noticed that the spec still contains a hole where this name would go.
> . . .

From john.r.rose at oracle.com  Thu Apr 19 22:13:00 2018
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 19 Apr 2018 15:13:00 -0700
Subject: [switch] Further unification on switch
In-Reply-To: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
Message-ID: <856ABF99-276C-4DAF-B92F-F9CB60A64C01@oracle.com>

On Apr 19, 2018, at 1:44 PM, Brian Goetz wrote:
>
> The result is one switch construct, with modern and legacy flavors, which supports either expressions or statements.

+10

Incrementally improving existing constructs is better, in this case (and usually) than piecemeal adding new-but-similar constructs to fill gaps.

More subtly, I prefer more, smaller, independently exercisable syntax options, instead of fewer "omnibus" choices -- a sushi menu instead of a chef's choice prix fixe menu with only a few low-information decisions. That tilts me towards {arrow,colon}x{expr,stmt} instead of {arrow+expr, colon+stmt}, or {switch, match}.

The choice not to mix arrow and colon, while more of a chef's-choice move, is OK with me; I see how requiring consistent colons or the arrows will help make switches more readable even if they grow large.

It's important to disallow both fall-in and fall-out for arrow cases, so you can ignore the code immediately before an arrow case, and immediately after its statement.

I anticipate "switching" to arrows as the preferred format for all of my switches, except for the small minority which for some odd reason need the very special expressiveness that comes from fallthrough.

-- John

From kevinb at google.com  Thu Apr 19 22:19:13 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 19 Apr 2018 15:19:13 -0700
Subject: [switch] Further unification on switch
In-Reply-To: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
Message-ID:

On Thu, Apr 19, 2018 at 1:44 PM, Brian Goetz wrote:

> And the reason we think that improving switch is the right primary goal is because a "do one of these based on ..." construct is _better_ than the corresponding chain of if-else-if, for multiple reasons:
>
> - Possibility for the compiler to do exhaustiveness analysis, potentially finding more bugs;
> - Possibility for more efficient dispatch -- a switch could be O(1), whereas an if-else chain is almost certainly O(n);
> - More semantically transparent -- it's obvious the user is saying "do one of these, based on ...";
> - Eliminates the need to repeat (and possibly get wrong) the switch target.

#3 is a big one: the more constrained the construct, the quicker the code is to read and understand. There is less that it *might* be doing, so it's clearer what it *is* doing.

Enums get even more benefits: unqualified constant names, getting to sidestep the `==`-or-equals() debate, and the special exhaustiveness stuff.

Indeed, switch rocks.

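The enum benefits listed above can be illustrated with a small sketch. It assumes the arrow-form switch expression as it later shipped in Java 14, not the exact syntax under discussion here; the `Color` enum and `hex` method are hypothetical names chosen for the example.

```java
public class EnumSwitchDemo {

    enum Color { RED, GREEN, BLUE }

    // Case labels use unqualified constant names (RED, not Color.RED), and no
    // == or equals() comparison is written by hand. Because this is a switch
    // expression that names every constant, the compiler accepts it without a
    // default arm; deleting one of the cases would be a compile-time error.
    static String hex(Color c) {
        return switch (c) {
            case RED -> "#ff0000";
            case GREEN -> "#00ff00";
            case BLUE -> "#0000ff";
        };
    }

    public static void main(String[] args) {
        for (Color c : Color.values()) {
            System.out.println(c + " " + hex(c));
        }
    }
}
```

The missing `default` is the point of the design: if a constant is added to `Color` and this file is recompiled, the switch stops compiling until the new constant is handled.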
## Closing the gap > > So, let's take one more step towards unifying the two forms (call this > UA2), rather than a step away from it. Let's say that _all_ switches can > support either old-style (colon) or new-style (arrow) case labels -- but > must stick to one kind of case label in a given switch: > Holy mackerel. I think this possibly gives us everything we wanted and then some. // also statement switch > switch (x) { > case 1 -> println("one"); > case 2 -> println("two"); > } > Can a single-statement case have a variable declaration as that statement, and what would be its scope? #### Look, accidental benefits! > > Many of switch's failings (fallthrough, scoping) are not directly > specified features, as much as emergent properties of the structure and > control flow of switches. Since by definition you can't fall out of an > arrow case, then an all-arrow switch gives the fallthrough-haters what they > want "for free", with no need to treat it specially. In fact, it's even > better; in the all-arrow form, all of the things people hate about switch > -- the need to say break, the risk of fallthrough, and the questionable > scoping -- all go away. > These benefits are so great that you should stop describing them as accidental. :-) I think you're saying that *every* switch statement that doesn't require fall-through can always be expressed in arrow form. If that's right, I'm very happy. Please make this happen. :-) (I'll still try to think of flaws if I can.) -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com 
From brian.goetz at oracle.com Thu Apr 19 22:27:06 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 19 Apr 2018 18:27:06 -0400 Subject: [switch] Further unification on switch In-Reply-To: References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> Message-ID: <9e18707d-c1bf-068f-8d3c-73192eabf852@oracle.com> > > Can a single-statement case have a variable declaration as that > statement, and what would be its scope? No, a local variable declaration is a BlockStatement (JLS 14.2), not a Statement (JLS 14.5). So you could say: case FOO -> println(3); or case FOO -> { int x = 4; println(x); } Of course, you don't have to ask about the scope of x. From guy.steele at oracle.com Thu Apr 19 22:12:33 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 19 Apr 2018 18:12:33 -0400 Subject: [switch] Further unification on switch In-Reply-To: References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> Message-ID: <19CEEC7A-BD95-4C8E-84CC-70401C0BF679@oracle.com> > On Apr 19, 2018, at 6:19 PM, Kevin Bourrillion wrote: > . . . > Can a single-statement case have a variable declaration as that statement, and what would be its scope? My guess would be "yes", and all the same things would happen as for a local variable declaration statement that happens to be the last statement of a block -- in particular, the variable bound by the last declarator has a scope so small that it cannot contain any references to that variable. 
-- Guy From guy.steele at oracle.com Thu Apr 19 22:14:26 2018 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 19 Apr 2018 18:14:26 -0400 Subject: [switch] Further unification on switch In-Reply-To: <9e18707d-c1bf-068f-8d3c-73192eabf852@oracle.com> References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> <9e18707d-c1bf-068f-8d3c-73192eabf852@oracle.com> Message-ID: <60CBBBC5-6335-40AB-9303-7A1D82B90EC5@oracle.com> > On Apr 19, 2018, at 6:27 PM, Brian Goetz wrote: > > >> >> Can a single-statement case have a variable declaration as that statement, and what would be its scope? > > No, a local variable declaration is a BlockStatement (JLS 14.2), not a Statement (JLS 14.5). So you could say: > > case FOO -> println(3); > > or > > case FOO -> { > int x = 4; > println(x); > } > > Of course, you don't have to ask about the scope of x. Please ignore my previous message; I stand corrected. (I forgot about the distinction between Statement and a BlockStatement. Time for me to go home and get some supper.) -- Guy From kevinb at google.com Thu Apr 19 23:51:05 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Thu, 19 Apr 2018 16:51:05 -0700 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: On Thu, Apr 19, 2018 at 2:50 PM, Brian Goetz wrote: I like Un{recognized,known}EnumConstantE{rror,xception}. > (Don't leave out "Unexpected", which might be the best one. It's not about what I "know" because I "know" that value exists now. I just didn't know it before, so I didn't *expect* it. Maybe "Unrecognized" isn't quite right either, because it might seem like once you've seen a thing once at runtime you should "recognize" it after that.) 
When we get to sealed types, it will be the same but with something like > s/EnumConstant/SealedTypeMember/. > > I am still having trouble squaring the Error vs Exception, but you've > pulled me from "seems like an Exception to me" into "crap, now I don't > know" territory :) > > I think what makes me uncomfortable is that there are some enums that are > _intended_ to be extended, such as java.lang.annotation.ElementType. (In > fact, we might be adding a new member soon; RECORD_COMPONENT.) And I would > want clients of ElementType to be aware that they never know all the > element types, and code accordingly. > > Which suggests that enums needs a mechanism to either mark them as sealed > (which turns on the enhanced exhaustiveness behavior) or as non-sealed > (which would turn it off). > In theory, clients of ElementType who are "aware that they never know all of them" should still have just as much right to decide whether they *want to be broken* when there are new ones to handle or not. But.... I think I do finally understand, thanks to your example, what is different between this and the previous kinds of incompatible changes. The JDK (and some libraries) makes strong promises not to break compatibility. Yet we simply can't throw up our hands and refuse to add constants to enums like ElementType. So either we need a way to mark it unsealed, or we have to do some very fiddly messaging, like "well, it's binary-compatible, and it's also source-compatible *except *for any breakages you *opted into* via defaultless switch expressions." The trouble with that being that many developers won't have actually consciously opted into it at all. But *maybe* that is a viable option? Or, third alternative, could we just backpedal and issue *all* of these messages as warnings instead of errors, and not need to make a distinction between two kinds of enums? I worry that this may make the feature relatively useless. 
I think that warnings that can be introduced nonlocally are almost never addressed - most developers just don't page through screenfuls of -Xlint output to handle it all. Where they are useful is when they show up *in the code you are editing*, but this case would rarely work that way. This matters because if we were to go the all-warnings route then suddenly this really isn't an IncompatibleClassChangeError at all - maybe you simply ignored the warning. Might make it not even really an Error. > > On 4/19/2018 5:43 PM, Kevin Bourrillion wrote: > > Necromancing, since I noticed that the spec still contains a hole where > this name would go. > > *Name:* > > - I think something specific like UnexpectedEnumConstantE{rror,xception} > would seem the right way to go. (Perhaps "Unrecognized"?) > > *Hierarchy: * > > - It will want a common supertype it can share with the future > "unexpected subtype of sealed type" error/exception. > - As for where that supertype goes, I still maintain that this is > *exactly* an IncompatibleClassChangeError (argument below), and thus > should be a subtype of that. I also see nothing harmed by it being an Error > instead of Exception. > > My claim is that releasing an enum with a certain set of constants is > qualitatively equivalent to releasing an interface with a certain set of > abstract methods. We know that people key behavior off of enums (that's > what enum switch is all about). That means that when we add a constant, we > are adding new *contract*, which we (the enum owners) don't know how to > fulfill. The call sites need to fulfill it. > > Thought experiment: I can already implement an interface in two different > ways: the normal way, or via a dynamic proxy that throws an exception if it > gets an unexpected method. Let's imagine that the latter way was made > exactly as easy to express as the former. I think everyone would probably > agree that most implementations would *still* choose the current > behavior. (Yes?) 
They don't *want* anything to fail at runtime that could > instead fail at compile-time. > > Anyway, all of this is just to support the notion that this should be an > IncompatibleClassChangeError. Of course, the argument's been made in this > thread that it *is* different from an incompatible class change. My > response was that these reasons seem way too subtle to me. Or, have I been > persistently missing something? > > > > On Fri, Mar 30, 2018 at 11:31 AM, Kevin Bourrillion > wrote: > >> On Fri, Mar 30, 2018 at 10:48 AM, Brian Goetz >> wrote: >> >> Backing way up, Alex had suggested that the right exception is (a subtype >>> of) IncompatibleClassChangeEXCEPTION, rather than Error. I was >>> concerned that ICC* would seem too low-level to users, though. But you're >>> saying ICCE and subtypes are helpful to users, because they guide users to >>> "blame your classpath". So in that case, is the ICC part a good enough >>> trigger? >>> >> >> (Just to be clear, Remi and I have been advocating for a subtype of ICC >> *Error* all along, in case anyone missed that.) >> >> All right, I've been focusing too much on the hierarchy, but the >> leaf-level name is more important than that (and the message text further >> still, and since I assume we'll do a fine job of that, I can probably relax >> a little). To answer your question, sure, the "ICC" is a pretty decent >> signal. Have we discussed Cyrill's point on -observers that we should >> create more specific exception types, such as UnrecognizedEnumConstantE{rror,xception}? >> >> >> For an enum in the same class/package/module as the switch, the chance of >>> getting the error at runtime is either zero (same class) or effectively >>> zero (same package or module), because all sane developers build packages >>> and modules in an atomic operation. >>> >>> For an enum in a different module as the switch, the chance of getting >>> the error at runtime is nonzero, because we're linking against a JAR at >>> runtime. 
>>> >>> So an alternative here is to tweak the language so that the "conclude >>> exhaustiveness if all enum constants are present" behavior should be >>> reserved for the cases where the switch and the enum are in the same >>> module? >>> >>> (Just a thought.) >>> >> >> Okay, that is a sane approach, but I think it leaves too much of the >> value on the floor. I often benefit from having my exhaustiveness validated >> and being able to find out at compile time if things change in the future. >> >> >> -- >> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com >> > > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Apr 20 00:27:52 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 19 Apr 2018 20:27:52 -0400 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> Message-ID: <362E5B57-EFBF-4922-9AE7-84601F65E62D@oracle.com> > I think I do finally understand, thanks to your example, what is different between this and the previous kinds of incompatible changes. The JDK (and some libraries) makes strong promises not to break compatibility. Yet we simply can't throw up our hands and refuse to add constants to enums like ElementType. So either we need a way to mark it unsealed, or we have to do some very fiddly messaging, like "well, it's binary-compatible, and it's also source-compatible except for any breakages you opted into via defaultless switch expressions." 
The trouble with that being that many developers won't have actually consciously opted into it at all. But maybe that is a viable option? Right. I think there are two kinds of enums, call them sealed and unsealed, in their promises about how expected or unexpected new constants will be. You're right that either default -- sealed or unsealed -- has problems; if the default is sealed, then anyone who has written an intended-to-be-extended enum now has to go and mark it, and if the default is unsealed, their clients will be irritated. And worse, many maintainers will be wrong about their intent to add new enum constants in the future. Very few enums are as black-and-white as the extremes of `enum BOOL { TRUE, FALSE }` and `ElementType`. But I think there _are_ two cases, and it's reasonable for the language to look to a declaration-site cue as to whether to assume sealed-ness or not. (I might think differently tomorrow.) From gavin.bierman at oracle.com Fri Apr 20 09:25:10 2018 From: gavin.bierman at oracle.com (Gavin Bierman) Date: Fri, 20 Apr 2018 10:25:10 +0100 Subject: [switch] Further unification on switch In-Reply-To: <856ABF99-276C-4DAF-B92F-F9CB60A64C01@oracle.com> References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> <856ABF99-276C-4DAF-B92F-F9CB60A64C01@oracle.com> Message-ID: <44C053F0-526C-42B1-AE4E-C8E90586A333@oracle.com> > On 19 Apr 2018, at 23:13, John Rose wrote: > > > I anticipate "switching" to arrows as the preferred format for all of my > switches, except for the small minority which for some odd reason > need the very special expressiveness that comes from fallthrough. That's what we always expected for switch expressions, but in this new unified world, you get the all--> form for switch statements too! 
The neat thing, if I may say so, is that this proposal layers the all--> form on top of the all-: form, so it's really all about enhancement as opposed to a new construct. Writing revised spec as we speak - be ready! Gavin From john.r.rose at oracle.com Fri Apr 20 16:58:37 2018 From: john.r.rose at oracle.com (John Rose) Date: Fri, 20 Apr 2018 09:58:37 -0700 Subject: Expression switch exception naming In-Reply-To: <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> Message-ID: On Mar 28, 2018, at 12:48 PM, Brian Goetz wrote: > > Some incompatibilities are more of a fire drill than others. Binary incompatibilities (e.g., removing a method) are harder to recover from than unexpected inputs. Further, while there may be no good _local_ recovery for an unexpected input, there often is a reasonable global recovery. Error means "fire drill". I claim this doesn't rise to the level of Error; it's more like NumberFormatException or NPE or ClassCastException. We want an unchecked throwable here. Given that, it's a judgement call whether to classify it as an Error or a RuntimeException. I agree with Brian that this condition does not rise to an error. The Error doc implies that errors are unrecoverable, by saying "a reasonable application should not try to catch" it. Does this mean that REs are more catchable? I guess, but what an RE says (to me, at least) is that a basic operation of the JVM or language is being used in a partial mode, an unexpected input has been presented to the operation, and the *normal* operation of the JVM is to reject the unexpected input. This is the case with CCE, NPE, ASE, and many other REs. 
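[Illustration, not part of the original message.] A minimal sketch of the "partial operation rejects an unexpected input" reading of RuntimeException described above. It assumes that "CCE", "NPE" and "ASE" stand for ClassCastException, NullPointerException and ArrayStoreException; the class and method names are hypothetical.

```java
public class PartialOperations {
    // Runs an operation and reports which RuntimeException, if any, rejected its input.
    static String outcome(Runnable op) {
        try {
            op.run();
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        Object text = "not a number";
        Object[] strings = new String[1]; // runtime element type is String
        String missing = null;

        // A cast is partial: it rejects objects of the wrong class.
        System.out.println(outcome(() -> { Integer i = (Integer) text; }));
        // An array store is partial: it rejects elements of the wrong runtime type.
        System.out.println(outcome(() -> strings[0] = Integer.valueOf(1)));
        // A method call is partial: it rejects a null receiver.
        System.out.println(outcome(() -> missing.length()));
    }
}
```

In each case the caller can fix the input, extend the code to accept it, or catch the exception semi-locally, which is the kind of response that would be out of place for an Error.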
If you buy this, then it follows that when a programmer gets an RE, he has a choice to (a) fix the input source, (b) extend the operation code to handle the unexpected input value, or (c) wrap a catch around the thing and do something semi-locally. None of these options are appropriate to an Error. So, a surprise enum value looks like an ICCE, yes, because it happens only when surprise recompilations occur. But that doesn't mean it must be an Error, because users can make any of the responses (a/b/c) above to it, so the mitigation actions are characteristic of an RE. One thing that tips the balance for me is remembering that ICCE-like conditions can reasonably manifest as REs. For example, if I refactor a class in its hierarchy (akin to changing an enum list), clients of that class might fail with CCE as they change their view of the moved class. Such CCEs often come from the translation strategy of erased generics, which embodies compile time type expectations in checkcast instructions. It seems reasonable to handle enum exhaustiveness (and eventually sealed hierarchy exhaustiveness) using an RE something like CCE. So I buy Brian's argument that this is not an Error but an RE. I bought it on a hunch, and the above reasoning seems to bear it out under more scrutiny. -- John 
From daniel.smith at oracle.com Fri Apr 20 17:04:41 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 20 Apr 2018 11:04:41 -0600 Subject: Expression switch exception naming In-Reply-To: <362E5B57-EFBF-4922-9AE7-84601F65E62D@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <362E5B57-EFBF-4922-9AE7-84601F65E62D@oracle.com> Message-ID: <6092DD86-D37B-46E3-B9FC-1B3C5CAF0FC3@oracle.com> > On Apr 19, 2018, at 6:27 PM, Brian Goetz wrote: > >> I think I do finally understand, thanks to your example, what is different between this and the previous kinds of incompatible changes. The JDK (and some libraries) makes strong promises not to break compatibility. Yet we simply can't throw up our hands and refuse to add constants to enums like ElementType. So either we need a way to mark it unsealed, or we have to do some very fiddly messaging, like "well, it's binary-compatible, and it's also source-compatible except for any breakages you opted into via defaultless switch expressions." The trouble with that being that many developers won't have actually consciously opted into it at all. But maybe that is a viable option? > > Right. I think there are two kinds of enums, call them sealed and unsealed, in their promises about how expected or unexpected new constants will be. You're right that either default -- sealed or unsealed -- has problems; if the default is sealed, then anyone who has written an intended-to-be-extended enum now has to go and mark it, and if the default is unsealed, their clients will be irritated. And worse, many maintainers will be wrong about their intent to add new enum constants in the future. Very few enums are as black-and-white as the extremes of `enum BOOL { TRUE, FALSE }` and `ElementType`. 
> > But I think there _are_ two cases, and its reasonable the language to look to a declaration-site cue as to whether to assume sealed-ness or not. (I might think differently tomorrow.) What I found in reviewing enums in the JDK is that it's pretty hard to find enums you can _guarantee_ will never grow. java.time.Month, sure. But what about Thread.State? StandardOpenOption? System.Logger.Level? Am I prepared to guarantee that we'll never have a reason to add another case? So some sort of declaration-site feature would, I fear, either get little use or be routinely ignored when changes must be made. ?Dan From daniel.smith at oracle.com Fri Apr 20 17:15:10 2018 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 20 Apr 2018 11:15:10 -0600 Subject: Feedback wanted: switch expression typing In-Reply-To: References: Message-ID: <16FAD24B-5AC0-492B-BBFA-EAD54BFC2790@oracle.com> > On Mar 28, 2018, at 1:37 PM, Dan Smith wrote: > > At this point, we've got a choice: > A) Fully mimic the conditional behavior in switch expressions > B) Do target typing (when available) for all switch expressions, diverging from conditionals > C) Do target typing (when available) for all switches and conditionals, accepting the incompatibilities The consensus of this thread seemed to be (B) or (C), which means we know what we want to do for switch expressions, and can spin off potential changes to conditionals as a separate task. (TODO: To address that task, I may drill deeper into usage statistics and refine my scanning tool.) To formalize the typing rules for switch expressions, here's a proposed specification. Some notes: - The goal is fidelity with conditional expressions: a standalone switch expression has the same type as an equivalent standalone conditional expression. Also, of course, the result must not be order-dependent for cases with more than 2 result expressions (test case: (int, double, Comparable)). 
- A proposed new rule that hasn't been previously discussed: to facilitate typing, there must be at least one result expression. (For example, all cases could throw, and this would be a compiler error.) - We'll need to identify all the places that poly conditional expressions get special treatment (overload resolution, inference, conditional expression classification, ...) and update them to support switch expressions too. ------------ A switch expression is a poly expression if it appears in an assignment context or an invocation context ([5.2], [5.3]). Otherwise, it is a standalone expression. The _result expressions_ of a switch expression are ... It is a compile-time error if a switch expression has no result expressions. [Design note: without at least one result expression, there's not a particularly good choice for the standalone type of the expression. It could still work as a poly expression, but for consistency it's probably better to reject the expression in all contexts.] Where a poly switch expression appears in a context of a particular kind with target type _T_, its result expressions similarly appear in a context of the same kind with target type _T_. A poly switch expression is compatible with a target type _T_ if each of its result expressions is compatible with _T_. The type of a poly switch expression is the same as its target type. The type of a standalone switch expression is determined as follows: - If the result expressions all have the same type (which may be the null type), then that is the type of the switch expression. - Otherwise, if the type of each result expression is `boolean` or `Boolean`, unboxing conversion ([5.1.8]) is applied to each result expression of type `Boolean`, and the switch expression has type `boolean`. - Otherwise, if the type of each result expression is convertible to a numeric type ([5.1.8]), the type of the switch expression is given by conditional numeric promotion ([5.6.3]) applied to the result expressions. 
- Otherwise, boxing conversion ([5.1.7]) is applied to each result expression that has a primitive type, after which the type of the switch expression is the least upper bound ([4.10.4]) of the types of the result expressions. ------ 5.6.3 Conditional numeric promotion [Design note: this introduces a new n-ary variety of numeric promotion. The rules for conditional expression typing would be revised to refer to it. I'm not sure what to call it, don't love this name but it's what I've come up with for now. Another approach -- more invasive but perhaps cleaner -- would be to use this for _all_ numeric promotions, and ask clients to specify if they want `int` as a "minimum type" (as is the case for all primitive operators).] When a conditional expression or switch expression applies conditional numeric promotion to a set of result expressions, each of which must denote a value that is convertible to a numeric type, the following rules apply, in order: - If any result expression is of a reference type, it is subjected to unboxing conversion. - Widening primitive conversion and narrowing primitive conversion are applied to some result expressions as specified by the following rules: - If any result expression is of type double, the others are widened, as necessary, to double. - Otherwise, if any result expression is of type float, the others are widened, as necessary, to float. - Otherwise, if any result expression is of type long, the others are widened, as necessary, to long. - Otherwise, if any result expression is of type int and is not a constant expression, the others are widened, as necessary, to int. - Otherwise, if any result expression is of type char, and every other result expression is either of type char, or of type byte, or a constant expression of type int with a value that is representable in the type char, then the byte results are widened to char and the int results are narrowed to char. 
- Otherwise, if any result expression is of type short, and every other result expression is either of type short, or of type byte, or a constant expression of type int with a value that is representable in the type short, then the byte results are widened to short and the int results are narrowed to short. - Otherwise, if any result expression is of type byte, and every other result expression is either of type byte or a constant expression of type int with a value that is representable in the type byte, then the int results are narrowed to byte. - Otherwise, all the results are widened, as necessary, to int. After the conversion(s), if any, value set conversion ([5.1.13]) is then applied to each operand. From brian.goetz at oracle.com Fri Apr 20 17:26:58 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 20 Apr 2018 13:26:58 -0400 Subject: Expression switch exception naming In-Reply-To: <6092DD86-D37B-46E3-B9FC-1B3C5CAF0FC3@oracle.com> References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <362E5B57-EFBF-4922-9AE7-84601F65E62D@oracle.com> <6092DD86-D37B-46E3-B9FC-1B3C5CAF0FC3@oracle.com> Message-ID: I am totally unsurprised to see this observation. And, I also think that if we did declare some of them sealed, on the basis of being unable to imagine that more would be added, we'd be wrong as often as right, other than the obvious extremes of Month and ElementType. Which puts us in a bind. That said, in light of the recent leap from UA1 to UA2, I am starting to question the wisdom of assuming that an enum is sealed. In light of morning, it feels more in the category of "picking winning idioms" rather than composing behavior from first principles. >> Right. I think there are two kinds of enums, call them sealed and unsealed, > What I found in reviewing enums in the JDK is that it's pretty hard to find enums you can _guarantee_ will never grow. 
java.time.Month, sure. But what about Thread.State? StandardOpenOption? System.Logger.Level? Am I prepared to guarantee that we'll never have a reason to add another case? > > So some sort of declaration-site feature would, I fear, either get little use or be routinely ignored when changes must be made. > From kevinb at google.com Fri Apr 20 17:37:13 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 20 Apr 2018 10:37:13 -0700 Subject: JEP325: Switch expressions spec In-Reply-To: References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> Message-ID: On Wed, Apr 18, 2018 at 11:16 AM, Kevin Bourrillion wrote: If one of the patterns is a constant expression or enum constant that is >> equal to the value of the selector expression, then we say that the pattern >> *matches*. >> > > I think "equal" is ambiguous for strings (and will be for doubles when > they happen). > Whoops: I belatedly noticed that we *are* expanding to all primitive types already. Okay. So, unless I missed it somewhere, I think we need to specify that all comparisons are made *as if by equals()* on the (boxed) type (or whatever better way to word it). We might even want to call out the specific consequences that case NaN: works as expected, and that 0.0 and -0.0 are fully distinct despite being ==. But here is the real wrench I want to throw in the works... (and I apologize, as always, if I am unknowingly rehashing old decisions that were already finalized): The most compelling reason to support float and double switches is the fact that pattern matching will automatically cover them via boxing anyway. If it were not for that, I believe it is a feature with too much risk to be useful. case 0.1: simply does not mean what* any* developer would wish it to mean. At Google we spend real effort trying to get our developers to depend *less* on exact floating-point equality, not more. So, all I'm asking is: can we make this particular change atomically with patterns itself, not before? 
I believe that the change has negative value until then because it is too easy to use it to write bugs. (If this means that long and boolean also wait until then, I don't think anyone would really mind. If necessary I can look up how common simulated-switch-on-long happens in our codebase, but we all know it won't be much.) ~~ Separately but similarly, the merits of case null: have also been justified almost entirely in the context of patterns. Without patterns, I believe the benefits are far too slight. We studied six digits' worth of switch statements in the Google codebase, using a *liberal* interpretation of whether they are simulating a null case, and came up with ... 2.4%. (You will find that suspicious as hell, since it's the exact same percentage I cited for fall-through yesterday, but I swear it's a coincidence!) Worse than the benefit being small is the fact that compatibility is forcing us to make a concession in its behavior that I am sure we would *never* make for a freshly designed feature. What we will be saying is: " default means default... except when it doesn't." It also will force us to support the construct default, case null: in the language, which we would never need for any other reason. (Occasionally people *do* list default together with other case labels, but they never *need* to.) Another problem, which is *much* smaller, is the "moral hazard" of possibly moving the needle back toward null-friendly programming - more than just permitting null, but actually ascribing particular *meaning* to null, which is well understood to be a code smell. Eh, I almost didn't mention this, but thought it should at least be considered. So, separately but similarly, I also ask: can we please delay this change until such time as it is (tragically) forced by patterns? -------------- next part -------------- An HTML attachment was scrubbed... 
From brian.goetz at oracle.com Fri Apr 20 17:45:12 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 20 Apr 2018 13:45:12 -0400 Subject: JEP325: Switch expressions spec In-Reply-To: References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> Message-ID: <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> > So, all I'm asking is: can we make this particular change atomically > with patterns itself, not before? I believe that the change has > negative value until then because it is too easy to use it to write > bugs. (If this means that long and boolean also wait until then, I > don't think anyone would really mind. If necessary I can look up how > common simulated-switch-on-long happens in our codebase, but we all > know it won't be much.) The extra primitive types are separable, so could be deferred. I'd be less sanguine about adding long now but not float. > Separately but similarly, the merits of case null: have also been > justified almost entirely in the context of patterns. Without > patterns, I believe the benefits are far too slight. We studied six > digits' worth of switch statements in the Google codebase, using a > /liberal/ interpretation of whether they are simulating a null case, > and came up with ... 2.4%. (You will find that suspicious as hell, > since it's the exact same percentage I cited for fall-through > yesterday, but I swear it's a coincidence!) More nervous about this. Would rather start the education curve on this earlier. And there are plenty of existing switches that are wrapped with "if target != null" that would be clearer/have less needless repetition by pushing the case null into the switch. 
From brian.goetz at oracle.com Fri Apr 20 17:50:33 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 20 Apr 2018 13:50:33 -0400
Subject: Expression switch exception naming
In-Reply-To:
References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <2125955809.1367233.1522250385521.JavaMail.zimbra@u-pem.fr> <27798a9d-d88f-8265-3c22-337d6d07bcb1@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com>
Message-ID:

> So I buy Brian's argument that this is not an Error but a RE.
> I bought it on a hunch, and the above reasoning seems to
> bear it out under more scrutiny.

After further discussion, the Error-vs-Exception debate here was really a proxy for a more important question, which is: are we justified in assuming that enums are effectively sealed, and using that in the semantics of switch? So far none of the answers are all that great.

Clearly, we're not adding any more booleans, so a switch that covers true and false is exhaustive. We're pretty comfortable with the idea that boolean is sealed. (You could make the same argument for byte.)

For an explicitly sealed type, I buy it's an ICCE when an unexpected subtype shows up. The author said "these are the only three subtypes", so a fourth is definitely cause for worry that your configuration is borked.

Enums are in the middle. We'd like to behave as if they're sealed, but they're not, and for some enums, they really are intended to be not sealed. (The "language version" enum in javax.lang.model or the API version enum in ASM are obvious examples.)

Here's an idea that was pretty unpopular when I first brought it up, but now that people see the problem more, it might shed more light on the problem, which is to consider boundaries.

If I have a class:

    class C {
        enum E { A, B; }
        ... switch (e) ...
    }

it's pretty clear that a switch that covers A and B is exhaustive; I can't add cases without recompiling C. Here, the user would surely want us to infer sealed-ness for E.

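For illustration only (not part of the original message): a minimal sketch of the same-class case just described, assuming the switch-expression syntax that eventually shipped in Java 14 rather than the draft under discussion in this thread. The class name SealedEnumSketch and the method describe are hypothetical. The switch lists every constant of E and has no default arm; because the enum and the switch are compiled together, adding a constant to E forces this switch to be revisited at compile time.

```java
// Hypothetical illustration; names are not from the thread.
final class SealedEnumSketch {
    // The enum and the switch live in the same compilation unit,
    // so they cannot drift apart through separate compilation.
    enum E { A, B }

    // No default arm: the switch expression compiles only because
    // every constant of E is covered.
    static String describe(E e) {
        return switch (e) {
            case A -> "first";
            case B -> "second";
        };
    }

    public static void main(String[] args) {
        for (E e : E.values()) {
            System.out.println(e + " -> " + describe(e));
        }
    }
}
```

What should happen when the enum lives in a different module and grows after the switch was compiled is exactly the open question in the message above; this sketch does not take a position on it.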
If I have two classes in the same _package_ ... can I make the same assumption? I think so. Packages are intended to be compiled as a unit. In fact, _modules_ are intended to be compiled as a unit. The assumption of sealed-ness for enums used within a package or module is pretty reasonable.

It's when we get to uses across module boundaries that inferring sealed-ness in the absence of an explicit annotation to that effect is questionable.

Now, no one is comfortable with the idea of using package or module boundaries here, but ultimately, the problem is that the validity of the assumption of sealed-ness is dependent on boundaries.

From guy.steele at oracle.com Fri Apr 20 17:48:49 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 20 Apr 2018 13:48:49 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To:
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com>
Message-ID: <4709FB11-1DE4-4CDB-8A75-A3FFE3160698@oracle.com>

> On Apr 20, 2018, at 1:37 PM, Kevin Bourrillion wrote:
>
> On Wed, Apr 18, 2018 at 11:16 AM, Kevin Bourrillion wrote:
>
>> If one of the patterns is a constant expression or enum constant that is equal to the value of the selector expression, then we say that the pattern matches.
>>
>> I think "equal" is ambiguous for strings (and will be for doubles when they happen).
>
> Whoops: I belatedly noticed that we are expanding to all primitive types already. Okay. So, unless I missed it somewhere, I think we need to specify that all comparisons are made as if by equals() on the (boxed) type (or whatever better way to word it). We might even want to call out the specific consequences that case NaN: works as expected, and that 0.0 and -0.0 are fully distinct despite being ==.
>
> But here is the real wrench I want to throw in the works... (and I apologize, as always, if I am unknowingly rehashing old decisions that were already finalized):
>
> The most compelling reason to support float and double switches is the fact that pattern matching will automatically cover them via boxing anyway. If it were not for that, I believe it is a feature with too much risk to be useful. case 0.1: simply does not mean what any developer would wish it to mean. At Google we spend real effort trying to get our developers to depend less on exact floating-point equality, not more.

I agree that it would (almost always) be insane to write `case 0.1:`. However, it may actually be extremely attractive for some purposes to write something like:

    switch (x) {
        case NaN -> foo;
        case Double.NEGATIVE_INFINITY -> bar;
        case Double.POSITIVE_INFINITY -> baz;
        case +0.0 -> quux;
        case -0.0 -> ztesch;
        default -> frobboz(x);
    }

and I can even imagine `case 1.0 ->` or `case 2.0 ->` sneaking in as special cases on occasion.

You don't always want to write library code this way; rather, analysis may show that `frobboz` computes the desired results for some or all of the special cases. But when code clarity is more important than that last ounce of speed, this is a very clear way to say what you want.

Seems to me that the compiler could certainly warn about case labels such as 0.1 that suffer rounding during the decimal-to-binary conversion. Or about any category of case labels we wish to denigrate.

-Guy

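For illustration only (not part of the original messages): a small sketch of the comparison semantics discussed above, using only `==` and `Double.equals` as they exist in Java today. Switching on a double is not valid Java, so the hypothetical classify method emulates the proposed "as if by equals() on the boxed type" rule with an if-chain. The class and method names are assumptions made for this example.

```java
// Hypothetical illustration; names are not from the thread.
final class FloatCaseSketch {
    // Primitive comparison, as performed by the == operator.
    static boolean primitiveEquals(double a, double b) {
        return a == b;
    }

    // Boxed comparison, as performed by Double.equals.
    static boolean boxedEquals(double a, double b) {
        return Double.valueOf(a).equals(b);
    }

    // Emulates the switch from the message above under the
    // "compare as if by equals() on the boxed type" rule.
    static String classify(double x) {
        Double boxed = x;
        if (boxed.equals(Double.NaN)) return "NaN";
        if (boxed.equals(Double.NEGATIVE_INFINITY)) return "-Infinity";
        if (boxed.equals(Double.POSITIVE_INFINITY)) return "+Infinity";
        if (boxed.equals(0.0)) return "+0.0";
        if (boxed.equals(-0.0)) return "-0.0";
        return "other";
    }

    // Why `case 0.1:` style labels are risky: neither 0.1, 0.2 nor 0.3
    // is exactly representable, so the sum differs from the literal.
    static boolean sumMatchesLiteral() {
        return 0.1 + 0.2 == 0.3;
    }

    public static void main(String[] args) {
        System.out.println("0.0 == -0.0      : " + primitiveEquals(0.0, -0.0));
        System.out.println("0.0 equals -0.0  : " + boxedEquals(0.0, -0.0));
        System.out.println("NaN == NaN       : " + primitiveEquals(Double.NaN, Double.NaN));
        System.out.println("NaN equals NaN   : " + boxedEquals(Double.NaN, Double.NaN));
        System.out.println("0.1 + 0.2 == 0.3 : " + sumMatchesLiteral());
        System.out.println("classify(-0.0)   : " + classify(-0.0));
    }
}
```

Under the boxed rule, the two zeros are distinct and NaN matches itself, which is what makes the special-value labels in the message above well defined; the same rule does nothing to rescue labels such as 0.1.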
URL: From kevinb at google.com Fri Apr 20 18:06:29 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 20 Apr 2018 11:06:29 -0700 Subject: Expression switch exception naming In-Reply-To: References: <0CB0D6F1-83AF-4C91-8A86-77BB8201DF67@oracle.com> <9d571eca-1628-a644-cc92-d50229b353d2@oracle.com> <2a83d8fb-df1d-a523-3399-44e1c4b5b891@oracle.com> <362E5B57-EFBF-4922-9AE7-84601F65E62D@oracle.com> <6092DD86-D37B-46E3-B9FC-1B3C5CAF0FC3@oracle.com> Message-ID: That's useful information, Dan, and I agree that it is unsurprising. To keep harping on the comparison to interface methods, it has been just the same with them. Many interfaces have been natural candidates to grow over time, and it was a bummer that they couldn't. (Then in 8 they acquired a way to attach their own default behavior, but there the analogy runs out of gas - enums could never really have that.) But here's the thing. For those sealed enums that never grow, this whole feature never really has that much value anyway. All it accomplishes is that it helps me avoid a simple mistake of leaving one off, and lets me skip an annoying pointless `default:`. What it does for the enum that grows is so much more valuable: I gain the opportunity to fix my code to do the right thing, before it fails at runtime. That's a good thing. I'm beginning to feel more convinced that the least-bad solution is just to massage the definition of what source compatibility really means. Because we can consider this breakage to be "opt-in", a project that commits to compatibility should *still* be free to grow any enum. And I don't think we even need a sealed distinction. On Fri, Apr 20, 2018 at 10:26 AM, Brian Goetz wrote: > I am totally unsurprised to see this observation. And, I also think that > if we did declare some of them sealed, on the basis of being unable to > imagine that more would be added, we'd be wrong as often as right, other > than the obvious extremes of Month and ElementType. Which puts us in a bind. 
> > That said, in light of the recent leap from UA1 to UA2, I am starting to > question the wisdom of assuming that an enum is sealed. In light of > morning, it feels more in the category of "picking winning idioms" rather > than composing behavior from first principles. > > Right. I think there are two kinds of enums, call them sealed and >>> unsealed, >>> >> What I found in reviewing enums in the JDK is that it's pretty hard to >> find enums you can _guarantee_ will never grow. java.time.Month, sure. But >> what about Thread.State? StandardOpenOption? System.Logger.Level? Am I >> prepared to guarantee that we'll never have a reason to add another case? >> >> So some sort of declaration-site feature would, I fear, either get little >> use or be routinely ignored when changes must be made. >> >> > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Fri Apr 20 18:36:17 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 20 Apr 2018 11:36:17 -0700 Subject: JEP325: Switch expressions spec In-Reply-To: <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> Message-ID: On Fri, Apr 20, 2018 at 10:45 AM, Brian Goetz wrote: > So, all I'm asking is: can we make this particular change atomically with > patterns itself, not before? I believe that the change has negative value > until then because it is too easy to use it to write bugs. (If this means > that long and boolean also wait until then, I don't think anyone would > really mind. If necessary I can look up how common simulated-switch-on- > long happens in our codebase, but we all know it won't be much.) > > The extra primitive types are separable, so could be deferred. I'd be > less sanguine about adding long now but not float. 
> Agreed, it would seem weird to keep adding more piecemeal over and over. Separately but similarly, the merits of case null: have also been justified > almost entirely in the context of patterns. Without patterns, I believe the > benefits are far too slight. We studied six digits' worth of switch > statements in the Google codebase, using a *liberal* interpretation of > whether they are simulating a null case, and came up with ... 2.4%. (You > will find that suspicious as hell, since it's the exact same percentage I > cited for fall-through yesterday, but I swear it's a coincidence!) > > More nervous about this. Would rather start the education curve on this > earlier. And there are plenty of existing switches that are wrapped with > "if target != null" that would be clearer/have less needless repetition by > pushing the case null into the switch. > Er - just clarifying that this is the *same* 2.4% that I am referring to. Of course, numbers will vary (and I concede that we are quite toward the null-hostile end of the spectrum in our general dev practices). Still, I'm sure we would not be making this change for this reason alone, so it really is about this issue of "starting the education curve earlier". Trying to figure out how much that matters. For what it's worth, Guava took the position at the start that, since working with null is risky and problematic, it's *okay* if code that deals with null is uglier than code that doesn't. It's only natural, so we don't bend over backwards to try to smooth it over. If that decision has played *some* *small* part in helping shift the world away from rampant overuse of null everywhere, we wouldn't regret it a bit. I think JDK collections post-1.4 could say the same thing to a larger degree. Okay, I guess this is just the "moral hazard" argument stated a different way - sorry. 
(Full disclosure: if you accuse me of wanting more time before `case null:` lands just so I have more time to try to talk us out of it completely, I suppose have no defense to that. :-)) -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Apr 20 18:40:53 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 20 Apr 2018 14:40:53 -0400 Subject: JEP325: Switch expressions spec In-Reply-To: References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> Message-ID: <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> One thing that is relevant to the short term is that now that we killed mixed labels, we'd have to have a way to say "case null or default" in arrow world.? The least stupid thing seems to be to allow default to be tacked on to a comma-separated case list as if it were a pattern: ??? case A -> s1; ??? case null, default -> s2; since you can no longer say: ??? case A -> s1; ??? case null: ??? default: ??????? s2; On 4/20/2018 2:36 PM, Kevin Bourrillion wrote: > On Fri, Apr 20, 2018 at 10:45 AM, Brian Goetz > wrote: > >> So, all I'm asking is: can we make this particular change >> atomically with patterns itself, not before? I believe that the >> change has negative value until then because it is too easy to >> use it to write bugs. (If this means that long?and booleanalso >> wait until then, I don't think anyone would really mind. If >> necessary I can look up how common >> simulated-switch-on-longhappens in our codebase, but we all know >> it won't be much.) > The extra primitive types are separable, so could be deferred.? > I'd be less sanguine about adding long now but not float. > > > Agreed, it would seem weird to keep adding more piecemeal over and over. 
> >> Separately but similarly, the merits of case null: have also been >> justified almost entirely in the context of patterns. Without >> patterns, I believe the benefits are far too slight. We studied >> six digits' worth of switch statements in the Google codebase, >> using a /liberal/ interpretation of whether they are simulating a >> null case, and came up with ... 2.4%.? (You will find that >> suspicious as hell, since it's the exact same percentage I cited >> for fall-through yesterday, but I swear it's a coincidence!) > More nervous about this.? Would rather start the education curve > on this earlier. And there are plenty of existing switches that > are wrapped with "if target != null" that would be clearer/have > less needless repetition by pushing the case null into the switch. > > > Er - just clarifying that this is the /same/ 2.4% that I am referring > to. Of course, numbers will vary (and I concede that we are quite > toward the null-hostile end of the spectrum in our general dev > practices). Still, I'm sure we would not be making this change for > this reason alone, so it really is about this issue of "starting the > education curve earlier". Trying to figure out how much that matters. > > For what it's worth, Guava took the position at the start that, since > working with null is risky and problematic, it's /okay/ if code that > deals with null is uglier than code that doesn't. It's only natural, > so we don't bend over backwards to try to smooth it over. If that > decision has played /some/ /small/ part in helping shift the world > away from rampant overuse of null everywhere, we wouldn't regret it a > bit. I think JDK collections post-1.4 could say the same thing to a > larger degree. Okay, I guess this is just the "moral hazard" argument > stated a different way - sorry. 
> > (Full disclosure: if you accuse me of wanting more time before `case > null:` lands just so I have more time to try to talk us out of it > completely, I suppose have no defense to that. :-)) > > -- > Kevin Bourrillion?|?Java Librarian |?Google, Inc.?|kevinb at google.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Fri Apr 20 18:49:31 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 20 Apr 2018 11:49:31 -0700 Subject: JEP325: Switch expressions spec In-Reply-To: <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> Message-ID: I was proposing `default, case null` above, in order to keep `default` parallel with `case` instead of appearing like it can be subordinate to it. Put another way `case null, default` looks like you can remove the `null, ` from it, just as you could in other labels. However, once that's allowed, then people would automatically try `case null, default` anyway. Whichever way we wanted, they would try both ways. This small mess constitutes one of the valid arguments (to me) for delaying this sad change as long as possible. :-) However, I admit I still need to make progress understanding exactly how big the *upside* is in the world of patterns. You've explained it patiently a few times and I probably just need to reread it. On Fri, Apr 20, 2018 at 11:40 AM, Brian Goetz wrote: > One thing that is relevant to the short term is that now that we killed > mixed labels, we'd have to have a way to say "case null or default" in > arrow world. 
The least stupid thing seems to be to allow default to be > tacked on to a comma-separated case list as if it were a pattern: > > case A -> s1; > case null, default -> s2; > > since you can no longer say: > > case A -> s1; > case null: > default: > s2; > > > On 4/20/2018 2:36 PM, Kevin Bourrillion wrote: > > On Fri, Apr 20, 2018 at 10:45 AM, Brian Goetz > wrote: > >> So, all I'm asking is: can we make this particular change atomically with >> patterns itself, not before? I believe that the change has negative value >> until then because it is too easy to use it to write bugs. (If this means >> that long and boolean also wait until then, I don't think anyone would >> really mind. If necessary I can look up how common simulated-switch-on- >> long happens in our codebase, but we all know it won't be much.) >> >> The extra primitive types are separable, so could be deferred. I'd be >> less sanguine about adding long now but not float. >> > > Agreed, it would seem weird to keep adding more piecemeal over and over. > > Separately but similarly, the merits of case null: have also been >> justified almost entirely in the context of patterns. Without patterns, I >> believe the benefits are far too slight. We studied six digits' worth of >> switch statements in the Google codebase, using a *liberal* >> interpretation of whether they are simulating a null case, and came up with >> ... 2.4%. (You will find that suspicious as hell, since it's the exact >> same percentage I cited for fall-through yesterday, but I swear it's a >> coincidence!) >> >> More nervous about this. Would rather start the education curve on this >> earlier. And there are plenty of existing switches that are wrapped with >> "if target != null" that would be clearer/have less needless repetition by >> pushing the case null into the switch. >> > > Er - just clarifying that this is the *same* 2.4% that I am referring to. 
> Of course, numbers will vary (and I concede that we are quite toward the > null-hostile end of the spectrum in our general dev practices). Still, I'm > sure we would not be making this change for this reason alone, so it really > is about this issue of "starting the education curve earlier". Trying to > figure out how much that matters. > > For what it's worth, Guava took the position at the start that, since > working with null is risky and problematic, it's *okay* if code that > deals with null is uglier than code that doesn't. It's only natural, so we > don't bend over backwards to try to smooth it over. If that decision has > played *some* *small* part in helping shift the world away from rampant > overuse of null everywhere, we wouldn't regret it a bit. I think JDK > collections post-1.4 could say the same thing to a larger degree. Okay, I > guess this is just the "moral hazard" argument stated a different way - > sorry. > > (Full disclosure: if you accuse me of wanting more time before `case > null:` lands just so I have more time to try to talk us out of it > completely, I suppose have no defense to that. :-)) > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
From guy.steele at oracle.com Fri Apr 20 18:55:56 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 20 Apr 2018 14:55:56 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To: <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com>
Message-ID:

You know, if s2 is short (say, less than 30 or 40 characters), there are worse things than writing

    case A    -> s1;
    case null -> s2;
    default   -> s2;

especially if you use spaces (as I just did) to line up the two occurrences of s2 to make it easy to see they are identical.

And if s2 is long, there are worse things than making a little sub method to handle it:

    case A    -> s1;
    case null -> frobboz(a, b);
    default   -> frobboz(a, b);

    int frobboz(int a, String b) { ... }

And if even THAT is not satisfactory, well, there are worse things than giving up on the arrows and just using colons (and break, if needed).

Yeah, null makes things uglier, but at least you have your choice of three different kinds of ugly.

_____________________________________________________________________________________________________

BUT, on the other hand, if we wanted to: instead of, or in addition to,

    case pat1, pat2, pat3 -> s;

we could allow the form

    case pat1 -> case pat2 -> case pat3 -> s;

which of course could be stacked vertically for visual graciousness and perspicuity:

    case pat1 ->
    case pat2 ->
    case pat3 -> s;

and such a format would clearly accommodate

    case A -> s1;
    case null ->
    default -> s2;

Con: Could look like a programming error (unintentionally omitted statement), but that's also true for the colon forms already permitted.

Con: More verbose than the comma-separated form `case pat1, pat2, pat3 ->`, which may matter for smallish switch expressions.

Pro: Doesn't stick `default` in a weird place, or otherwise make a special rule just to handle "default and null".

Pro: The keyword `case` appears in front of EVERY individual pattern, making them easier to see.

Pro: Avoids possible confusion between `case a,b,c ->` and `case (a,b,c) ->`.

Motto: "It's not fallthrough, it's just a SwitchBlockStatementGroup."

> On Apr 20, 2018, at 2:40 PM, Brian Goetz wrote:
>
> One thing that is relevant to the short term is that now that we killed mixed labels, we'd have to have a way to say "case null or default" in arrow world. The least stupid thing seems to be to allow default to be tacked on to a comma-separated case list as if it were a pattern:
>
>     case A -> s1;
>     case null, default -> s2;
>
> since you can no longer say:
>
>     case A -> s1;
>     case null:
>     default:
>         s2;

From guy.steele at oracle.com Fri Apr 20 18:58:48 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 20 Apr 2018 14:58:48 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To:
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com>
Message-ID: <1ADDB99F-C814-4AAA-82E4-1D454C6612D3@oracle.com>

> On Apr 20, 2018, at 2:55 PM, Guy Steele wrote:
> . . .
> Motto: "It's not fallthrough, it's just a SwitchBlockStatementGroup."

So the syntactic explanation for switch expressions would generalize from

    "case a -> s;" means "case a: break s;"

to

    "case a -> case b -> ... case z -> s;" means "case a: case b: ... case z: break s;"

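For illustration only (not part of the original messages): a minimal sketch of the arrow/colon correspondence described above, written in the syntax that eventually shipped in Java 14, where the value-yielding statement is spelled `yield s;` rather than the `break s;` of this draft. The class and method names are hypothetical. The chained form `case a -> case b -> s` is omitted because it is not valid in released Java; the comma-separated arrow label and the stacked colon labels are the two forms that can be compared directly.

```java
// Hypothetical illustration; names are not from the thread.
final class ArrowColonSketch {
    // Arrow form with a comma-separated label list.
    static String arrowForm(int k) {
        return switch (k) {
            case 1, 2, 3 -> "small";
            default -> "other";
        };
    }

    // Colon form: stacked labels share one statement group,
    // and each group produces its value with yield.
    static String colonForm(int k) {
        return switch (k) {
            case 1: case 2: case 3:
                yield "small";
            default:
                yield "other";
        };
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 4; k++) {
            System.out.println(k + ": " + arrowForm(k) + " / " + colonForm(k));
        }
    }
}
```

The two methods are meant to agree for every input, which is the kind of uniform conversion rule the messages above argue the two syntaxes should obey.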
From dl at cs.oswego.edu Sun Apr 22 12:29:39 2018
From: dl at cs.oswego.edu (Doug Lea)
Date: Sun, 22 Apr 2018 08:29:39 -0400
Subject: [records] Ancillary fields
In-Reply-To:
References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com>
Message-ID: <5fc0bc8e-f160-569d-3d0b-ef65362b240c@cs.oswego.edu>

On 04/18/2018 01:58 PM, Brian Goetz wrote:
> Seeing no dissent on the claim that the essential use case for ancillary
> fields is caching derived properties,

No dissent, but there is a small leap from here to your proposal,
that addresses only derived initial local properties. Which may be OK.
More broadly, these cases fall under uses of Memoization
(https://en.wikipedia.org/wiki/Memoization) in which caches are
never evicted.
And in turn monotonic predicates (that never become false once set true)
and/or their ancillary data. But short of other mutable monotonic data
(that are non-decreasing wrt some ordering).

The case of initial local properties is common, and may be deserving
of syntax support to increase chances of correct implementation.
But I'm still bothered by lack of a story about any of the other cases.
Maybe the "story" is a tutorial rather than further syntax though.

-Doug

From brian.goetz at oracle.com Sun Apr 22 13:15:29 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 22 Apr 2018 15:15:29 +0200
Subject: [records] Ancillary fields
In-Reply-To: <5fc0bc8e-f160-569d-3d0b-ef65362b240c@cs.oswego.edu>
References: <75ded733-0a75-b72c-2990-d418e87edb40@oracle.com> <5fc0bc8e-f160-569d-3d0b-ef65362b240c@cs.oswego.edu>
Message-ID: <3A8A0007-BF7F-4CA5-BB9D-4A704246179F@oracle.com>

Yes, you caught me :)

The lazy fields story made sense to me here because (a) it makes enough sense on its own and (b) it seems just enough to avoid the most common cases of "dumb records just aren't enough". But it is, to a degree, a patch.

Sent from my MacBook Wheel

> On Apr 22, 2018, at 2:29 PM, Doug Lea
wrote: > >> On 04/18/2018 01:58 PM, Brian Goetz wrote: >> Seeing no dissent on the claim that the essential use case for ancillary >> fields is caching derived properties, > > No dissent, but there is a small leap from here to your proposal, > that addresses only derived initial local properties. Which may be OK. > More broadly, these cases fall under uses of Memoization > (https://en.wikipedia.org/wiki/Memoization) in which caches are > never evicted. > And in turn monotonic predicates (that never become false once set true) > and/or their ancillary data. But short of other mutable monotonic data > (that are non-decreasing wrt some ordering). > > The case of initial local properties is common, and may be deserving > of syntax support to increase chances of correct implementation. > But I'm still bothered by lack of a story about any of the other cases. > Maybe the "story" is a tutorial rather than further syntax though. > > -Doug > From kevinb at google.com Mon Apr 23 18:02:30 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Mon, 23 Apr 2018 11:02:30 -0700 Subject: JEP325: Switch expressions spec In-Reply-To: References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> Message-ID: On Fri, Apr 20, 2018 at 11:55 AM, Guy Steele wrote: You know, if s2 is short (say, less than 30 or 40 characters), there are > worse things than writing > > case A -> s1; > case null -> s2; > default -> s2; > > especially if you use spaces (as I just did) to line up the two > occurrences of s2 to make it easy to see they are identical. > > And if s2 is long, there are worse things than making a little sub method > to handle it: > > case A -> s1; > case null -> frobboz(a, b); > default -> frobboz(a, b); > > int frobboz(int a, String b) { ? 
} > > And if even THAT is not satisfactory, well, there are worse things than > giving up on the arrows and just using colons (and break, if needed). > I think neither of these goes down well. Having to repeat yourself at all, while normal cases get to use comma, will feel very wrong. Having to abandon arrowform over this would be even worse. BUT, on the other hand, if we wanted to: instead of, or in addition to, > > case pat1, pat2, pat3 -> s; > > we could allow the form > > case pat1 -> case pat2 -> case pat3 -> s; > This seems like a step backward to me (whether it is instead or in addition). fwiw, I think `default, case null ->` is superior to all of these options. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Mon Apr 23 18:29:28 2018 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 23 Apr 2018 20:29:28 +0200 (CEST) Subject: JEP325: Switch expressions spec In-Reply-To: References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> Message-ID: <1665147122.374932.1524508168404.JavaMail.zimbra@u-pem.fr> I agree with Kevin, case pat1 -> case pat2 -> s is too close to case pat1 -> pat2 -> s which has a very different meaning (the result is a lambda). R?mi > De: "Kevin Bourrillion" > ?: "Guy Steele" > Cc: "amber-spec-experts" > Envoy?: Lundi 23 Avril 2018 20:02:30 > Objet: Re: JEP325: Switch expressions spec > On Fri, Apr 20, 2018 at 11:55 AM, Guy Steele < [ mailto:guy.steele at oracle.com | > guy.steele at oracle.com ] > wrote: >> You know, if s2 is short (say, less than 30 or 40 characters), there are worse >> things than writing >> case A -> s1; >> case null -> s2; >> default -> s2; >> especially if you use spaces (as I just did) to line up the two occurrences of >> s2 to make it easy to see they are identical. 
>> And if s2 is long, there are worse things than making a little sub method to >> handle it: >> case A -> s1; >> case null -> frobboz(a, b); >> default -> frobboz(a, b); >> int frobboz(int a, String b) { ? } >> And if even THAT is not satisfactory, well, there are worse things than giving >> up on the arrows and just using colons (and break, if needed). > I think neither of these goes down well. Having to repeat yourself at all, while > normal cases get to use comma, will feel very wrong. Having to abandon > arrowform over this would be even worse. >> BUT, on the other hand, if we wanted to: instead of, or in addition to, >> case pat1, pat2, pat3 -> s; >> we could allow the form >> case pat1 -> case pat2 -> case pat3 -> s; > This seems like a step backward to me (whether it is instead or in addition). > fwiw, I think `default, case null ->` is superior to all of these options. > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | [ mailto:kevinb at google.com | > kevinb at google.com ] -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Mon Apr 23 18:20:22 2018 From: guy.steele at oracle.com (Guy Steele) Date: Mon, 23 Apr 2018 14:20:22 -0400 Subject: JEP325: Switch expressions spec In-Reply-To: References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> Message-ID: > On Apr 23, 2018, at 2:02 PM, Kevin Bourrillion wrote: > > On Fri, Apr 20, 2018 at 11:55 AM, Guy Steele > wrote: > > You know, if s2 is short (say, less than 30 or 40 characters), there are worse things than writing > > case A -> s1; > case null -> s2; > default -> s2; > > especially if you use spaces (as I just did) to line up the two occurrences of s2 to make it easy to see they are identical. 
> > And if s2 is long, there are worse things than making a little sub method to handle it: > > case A -> s1; > case null -> frobboz(a, b); > default -> frobboz(a, b); > > int frobboz(int a, String b) { ? } > > And if even THAT is not satisfactory, well, there are worse things than giving up on the arrows and just using colons (and break, if needed). > > I think neither of these goes down well. Having to repeat yourself at all, while normal cases get to use comma, will feel very wrong. Having to abandon arrowform over this would be even worse. > > > BUT, on the other hand, if we wanted to: instead of, or in addition to, > > case pat1, pat2, pat3 -> s; > > we could allow the form > > case pat1 -> case pat2 -> case pat3 -> s; > > This seems like a step backward to me (whether it is instead or in addition). > > fwiw, I think `default, case null ->` is superior to all of these options. As an ad-hoc patch that solves this one problem, I agree. But let me make a revised pitch for allowing it ?in addition to? (which, now that I have pondered it over the weekend, I think is clearly the correct approach). (1) We have moved toward allowing ?arrow versus colon? to be a syntax choice that is COMPLETELY orthogonal to other choices about the use of `switch`. If this rule is to hold universally, then any switch statement or expression should be convertible between the arrow form and the colon form using a simple, uniform rule. (2) In switch expressions we want to be able to use the concise notation `case a, b, c -> s;` for a switch clause. (3) From (1) and (2) we inexorably conclude that `case a, b, c: break s;` must also be a valid syntax. (4) But we could also have written (3) as `case a: case b: case c: break s;` and we certainly expect them to have equivalent behavior. (5) From (1) and (4) we conclude that we ought also be to be able to write `case a -> case b -> case c -> s;`. Notice that so far I have said nothing about the ?default and null? problem being a motivation. 
It's all about preserving assumption (1).

The issue with default and null has nothing to do with the issue of arrow versus colon. It *does* have to do with the issue of repeating the `case` keyword versus listing multiple values (or patterns) by using commas after a single `case` keyword: `default` does not play well with the comma-separated case when you use colons, so there is no reason to expect it to play well when you use arrows. Trying to make a special-case exception for `default` in the arrow case requires making a similarly ugly exception in the colon case if assumption (1) is to be preserved.

I argue that there is no need to make the special-case exception for `default`. When you need to play that game (usually because null needs to be addressed), you cannot use the comma-separation syntax. Instead, just say either

    case null: default: s;

(or, if you prefer, `default: case null: s;`) or

    case null -> default -> s;

(or, if you prefer, `default -> case null -> s;`) depending on whether your `switch` is using colon syntax or arrow syntax. The latter is just two characters longer than `default, case null -> s;` and has a much simpler and more consistent underlying theory.

--Guy

From guy.steele at oracle.com Mon Apr 23 18:27:27 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 23 Apr 2018 14:27:27 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To: <1665147122.374932.1524508168404.JavaMail.zimbra@u-pem.fr>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> <1665147122.374932.1524508168404.JavaMail.zimbra@u-pem.fr>
Message-ID: <59752747-20F0-49E0-94E0-DA54FB9E6931@oracle.com>

Good point, Rémi. However, note that

    case pat1, pat2 -> s

is equally too close to

    case pat1 -> pat2 -> s

and again they have very different meanings.
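
For readers who want to try the contrast with a compiler, here is a minimal sketch. It assumes the arrow-form switch expressions that later shipped in Java 14, takes `pat1` and `pat2` to be integer constants in the first method, and takes `pat2` to be a lambda parameter (renamed `y`) in the second. The class and method names are invented for illustration and do not come from the thread.

```java
import java.util.function.IntUnaryOperator;

public class ArrowAmbiguity {
    // `case 1, 2 -> s`: two labels sharing one arm; the switch yields an int.
    static int twoLabels(int x) {
        return switch (x) {
            case 1, 2 -> 10;
            default -> 0;
        };
    }

    // `case 1 -> y -> s`: one label whose arm is a lambda.
    // Here `y` is a lambda parameter, not a second case label.
    static IntUnaryOperator lambdaResult(int x) {
        return switch (x) {
            case 1 -> y -> y + 10;
            default -> y -> y;
        };
    }

    public static void main(String[] args) {
        System.out.println(twoLabels(2));
        System.out.println(lambdaResult(1).applyAsInt(2));
        System.out.println(lambdaResult(2).applyAsInt(2));
    }
}
```

The two arms differ by a single token after the first label, yet one switch produces a number and the other produces a function, which is the kind of blunder being discussed.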
We have to admit that there is room to blunder with this syntax.

One way out would be to use a different arrow for `switch` statements:

    switch (x) {
      case pat1 => case pat2 => s1;
      case pat3 => pat4 -> s2;
      case pat5, pat6 => s2;
      case pat7, pat8 => pat9 -> s4;
    }

> On Apr 23, 2018, at 2:29 PM, Remi Forax wrote:
>
> I agree with Kevin,
>     case pat1 -> case pat2 -> s
> is too close to
>     case pat1 -> pat2 -> s
> which has a very different meaning (the result is a lambda).
>
> Rémi

From guy.steele at oracle.com Mon Apr 23 18:32:26 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 23 Apr 2018 14:32:26 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To: <59752747-20F0-49E0-94E0-DA54FB9E6931@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> <1665147122.374932.1524508168404.JavaMail.zimbra@u-pem.fr> <59752747-20F0-49E0-94E0-DA54FB9E6931@oracle.com>
Message-ID: <6109AA8E-6F32-4ED5-BFFA-2A77E4C279BE@oracle.com>

> On Apr 23, 2018, at 2:27 PM, Guy Steele wrote:
>
> Good point, Rémi. However, note that
>
>     case pat1, pat2 -> s
>
> is equally too close to
>
>     case pat1 -> pat2 -> s
>
> and again they have very different meanings.
>
> We have to admit that there is room to blunder with this syntax.
>
> One way out would be to use a different arrow for `switch` statements:
>
>     switch (x) {
>       case pat1 => case pat2 => s1;
>       case pat3 => pat4 -> s2;
>       case pat5, pat6 => s2;
>       case pat7, pat8 => pat9 -> s4;
>     }

As a careful coder, if I did not have a separate arrow `=>` (and probably even if I did), I would use formatting and parentheses to convey my intent:

    switch (x) {
      case pat1 ->
      case pat2 -> s1;
      case pat3 -> (pat4 -> s2);
      case pat5, pat6 -> s2;
      case pat7, pat8 -> (pat9 -> s4);
    }

--Guy

From forax at univ-mlv.fr Mon Apr 23 18:48:37 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 23 Apr 2018 20:48:37 +0200 (CEST)
Subject: [switch] Further unification on switch
In-Reply-To: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com>
Message-ID: <2102124457.376153.1524509317075.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Thursday, April 19, 2018 22:44:45
> Subject: [switch] Further unification on switch

> We've been reviewing the work to date on switch expressions. Here's where we are, and here's a possible place we might move to, which I like a lot better than where we are now.
>
> ## Goals
>
> As a reminder, remember that the primary goal here is _not_ switch expressions; switch expressions are supposed to just be an uncontroversial waypoint on the way to the real goal, which is a more expressive and flexible switch construct that works in a wider variety of situations, including supporting patterns, being less hostile to null, use as either an expression or a statement, etc.
>
> And the reason we think that improving switch is the right primary goal is because a "do one of these based on ..."
construct is _better_ than the corresponding chain of if-else-if, for multiple reasons:
>
>  - Possibility for the compiler to do exhaustiveness analysis, potentially finding more bugs;
>  - Possibility for more efficient dispatch -- a switch could be O(1), whereas an if-else chain is almost certainly O(n);

and we can also create the dispatch code dynamically, which can be more efficient (when calling the switch with only a few values)

>  - More semantically transparent -- it's obvious the user is saying "do one of these, based on ...";
>  - Eliminates the need to repeat (and possibly get wrong) the switch target.
>
> Switch does come with a lot of baggage (fallthrough by default, questionable scoping, need to explicitly break), and this baggage has produced the predictable distractions in the discussion -- a desire that we subordinate the primary goal (making switch more expressive) to the more contingent goal of "fixing" the legacy problems of switch.
>
> These legacy problems of switch may be unfortunate, but to whatever degree we end up ameliorating these, this has to be purely a side-benefit -- it's not the primary goal, no matter how annoying people find them. (The desire to "fix" the mistakes of the past is frequently a siren song, which is why we don't allow ourselves to take these as first-class requirements.)
>
> #### What we're not going to do
>
> The worst possible outcome (which is also the most commonly suggested "solution" in forums like reddit) would be to invent a new construct that is similar to, but not quite the same as switch (`snitch`), without being a 100% replacement for today's quirky switch. Today's switch is surely suboptimal, but it's not so fatally flawed that it needs to be euthanized, and we don't want to create an "undead" language construct forever, which everyone will still have to learn, and keep track of the differences between `switch` and `snitch`. No thank you.
>
> That means we extend the existing switch statement, and increase flexibility by supporting an expression form, and to the degree needed, embrace its quirks. ("No statement left behind.")
>
> #### Where we started
>
> In the first five minutes of working on this project, we sketched out the following (call it the "napkin sketch"), where an expression switch has case arms of the form:
>
>     case L -> e;
> or
>     case L -> { statement*; break e; }
>
> This was enough to get started, but of course the devil is in the details.
>
> #### Where we are right now
>
> We moved away from the napkin sketch for a few reasons, in part because it seemed to be drawing us down the road towards switch and snitch -- which was further worrying as we still had yet to deal with the potential that pattern switch and constant switch might have differences as well. We want a unified model of switch that deals well enough with all the cases -- expressions and statements, patterns and constants.
>
> Our current model (call this Unification Attempt #1, or UA1 for short) is a step towards a unified model of switch, and this is a huge step forward. In this model, there's one switch construct, and there's one set of control flow rules, including for break (like return, break takes a value in a value context and is void in a void context).
>
> For convenience and safety, we then layered a shorthand atop value-bearing switches, which is to interpret
>
>     case L -> e;
>
> as
>
>     case L: break e;
>
> expecting the shorter form would be used almost all the time. (This has a pleasing symmetry with the expression form of lambdas, and (at least for expression switches) alleviates two of the legacy pain points. Switch expressions have other things in common with lambdas too; they are the only ones that can have statements; they are the only ones that interact with nonlocal control flow.)
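
As a compilable illustration of the napkin-sketch arms and of the shorthand reading described above, here is a small sketch. It relies on one assumption that is not part of this thread: the feature as later finalized in Java 14 spells break-with-value as `yield`, so `break e` appears below as `yield e`. The class and method names are made up for the example.

```java
public class NapkinSketch {
    // Arrow arms: `case L -> e;` and `case L -> { statement*; break e; }`,
    // with `yield` standing in for the thread's `break e`.
    static String arrowForm(int n) {
        return switch (n) {
            case 0 -> "zero";
            case 1 -> {
                String s = "o" + "ne";
                yield s;
            }
            default -> "many";
        };
    }

    // The same switch with every arrow arm read as `case L: break e;`
    // (again spelled with `yield`).
    static String colonForm(int n) {
        return switch (n) {
            case 0: yield "zero";
            case 1: {
                String s = "o" + "ne";
                yield s;
            }
            default: yield "many";
        };
    }

    public static void main(String[] args) {
        for (int n = 0; n < 3; n++) {
            System.out.println(arrowForm(n) + " / " + colonForm(n));
        }
    }
}
```

Both methods return the same value for every input; the arrow version simply leaves the value-yielding statement implicit when the arm is a single expression.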
>
> This approach offers a lot of flexibility (some would say too much). You can write "remi-style" expression switches:
>
>     int x = switch (y) {
>         case 1: break 2;
>         case 2: break 4;
>         default: break 8;
>     };
>
> or you can write "new-style" expression switches:
>
>     int x = switch (y) {
>         case 1 -> 2;
>         case 2 -> 4;
>         default -> 8;
>     };
>
> Some people like the transparency of the first; others like the compactness and fallthrough-safety of the second. And in cases where you mostly want the benefits of the second, but the real world conspires to make one or two cases difficult, you can mix them, and take full advantage of what "old switch" does -- with no new rules for control flow.
>
> #### Complaints
>
> There were the usual array of complaints over syntax -- many of which can be put down to "bleah, new is different, different is bad", but the most prominent one seems to be a generalized concern that other users (never us, of course, but we always fear for what others might do) won't be able to "handle" the power of mixed switches and will write terrible code, and then the world will burn. (And, because the mixing comes with fallthrough, it further engenders the "you idiots, you fixed the wrong thing" reactions.) Personally, I think the fear of mixing is deeply overblown -- I think in most cases people will gravitate towards one of the two clean styles, and only mix where the complexity of the real world forces them to, but there's value in understanding the underpinnings of such reactions, even if in the end they'd turn out to be much hot air about nothing.
>
> #### A real issue with mixing!
>
> But, there is a real problem with our approach, which is: while a unified switch is the right goal, UA1 is not unified _enough_. Specifically, we haven't fully aligned the statement forms, and this conspires to reduce expressiveness and safety.
That is, in an expression switch you can say:
>
>     case L -> e;
>
> but in a statement switch you can't say
>
>     case L -> s;
>
> The reason for this is a purely accidental one: if we allowed this, then we _would_ likely find ourselves in the mixing hell that people are afraid of, which in turn would make the risk of accidental fallthrough _even worse_ than it is today. So the failing of mixing is not that it will be abused, but that it constrains us from actually getting to a unified construct.

Good argument!

>
> ## Closing the gap
>
> So, let's take one more step towards unifying the two forms (call this UA2), rather than a step away from it. Let's say that _all_ switches can support either old-style (colon) or new-style (arrow) case labels -- but must stick to one kind of case label in a given switch:
>
>     // statement switch
>     switch (x) {
>         case 1: println("one"); break;
>         case 2: println("two"); break;
>     }
>
> or
>
>     // also statement switch
>     switch (x) {
>         case 1 -> println("one");
>         case 2 -> println("two");
>     }
>
> If a switch is a statement, the RHS is a statement, which can be a block statement:
>
>     case L -> { a; b; }
>
> We get there by first taking a step backwards, at least in terms of superficial syntax, to the syntax suggested by the napkin sketch, where if a switch is an expression, the RHS of an -> case is an expression or a block statement (in the latter case, it must complete abruptly by reason of either break-value or throw). Just as we expected "break value" to be rare in expression switches under UA1 since developers will generally prefer the shorthand form where applicable, we expect it to be equally rare under UA2.
>
> Then, as in UA1, we render unto expressions the things that belong to expressions; they must be total (an expression must yield a value or complete abruptly by reason of throwing.)
>
> #### Look, accidental benefits!
>
> Many of switch's failings (fallthrough, scoping) are not directly specified features, as much as emergent properties of the structure and control flow of switches. Since by definition you can't fall out of an arrow case, then an all-arrow switch gives the fallthrough-haters what they want "for free", with no need to treat it specially. In fact, it's even better; in the all-arrow form, all of the things people hate about switch -- the need to say break, the risk of fallthrough, and the questionable scoping -- all go away.
>
> #### Scorecard
>
> There is one switch construct, which can be used as either an expression or a statement; when used as an expression, it acquires the characteristics of expressions (must be total, no nonlocal control flow out.) Each can be expressed in one of two syntactic forms (arrow and colon.) All forms will support patterns, null handling, and multiple labels per case. The control flow and scoping rules are driven by structural properties of the chosen form.
>
> The (statement, colon) case is the switch we have had since Java 1.0, enhanced as above (patterns, nulls, etc.)
>
> The (statement, arrow) case can be considered a nice syntactic shorthand for the previous, which obviates the annoyance of "break", implicitly prevents fallthrough of all forms, and avoids the confusion of current switch scoping. Many existing statement switches that are not expressions in disguise can be refactored to this.
>
> The (expression, colon) form is a subset of UA1, where you just never say "arrow".
>
> The (expression, arrow) case can again be considered a nice shorthand for the previous, again a subset of UA1, where you just never say "colon", and as a result, again don't have to think about fallthrough.
>
> Totality is a property of expression switches, regardless of form, because they are expressions, and expressions must be total.
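
The four quadrants of the scorecard can be written side by side. The sketch below is only an illustration under stated assumptions: it uses the syntax as later finalized in Java 14 (so break-with-value is written `yield`), sticks to constant labels, and uses invented class and method names.

```java
public class SwitchQuadrants {
    // (statement, colon): the classic form, explicit break in each group.
    static String statementColon(int x) {
        String r;
        switch (x) {
            case 1: r = "one"; break;
            case 2: r = "two"; break;
            default: r = "other"; break;
        }
        return r;
    }

    // (statement, arrow): no break needed, no fallthrough possible.
    static String statementArrow(int x) {
        String r;
        switch (x) {
            case 1 -> r = "one";
            case 2 -> r = "two";
            default -> r = "other";
        }
        return r;
    }

    // (expression, colon): each group states its value explicitly.
    static String expressionColon(int x) {
        return switch (x) {
            case 1: yield "one";
            case 2: yield "two";
            default: yield "other";
        };
    }

    // (expression, arrow): the shorthand of the previous form.
    static String expressionArrow(int x) {
        return switch (x) {
            case 1 -> "one";
            case 2 -> "two";
            default -> "other";
        };
    }

    public static void main(String[] args) {
        for (int x = 0; x < 4; x++) {
            System.out.println(statementColon(x) + " " + statementArrow(x)
                    + " " + expressionColon(x) + " " + expressionArrow(x));
        }
    }
}
```

The `default` arm in the two expression methods is what satisfies totality for an `int` selector; removing it from either one is a compile-time error, while the two statement methods only need it here so that `r` is definitely assigned.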
>
> Fallthrough is a property of the colon-structured switches; there are no changes here.
>
> Nonlocal control flow _out_ of a switch (continue to an enclosing loop, break with label, return) are properties of statement switches.
>
> So essentially, rather than dividing the semantics along expression/statement lines, and then attempting to opportunistically heap a bunch of irrelevant features like "no fallthrough" onto the expression side "because they're cool" even though they have nothing to do with expression-ness, we instead divide the world structurally: the colon form gives you the old control flow, and the arrow form gives you the new. And either can be used as a statement, or an expression. And no one will be confused by mixing.
>
> Orthogonality FTW. No statement gets left behind.
>
> ## Explaining it
>
> Relative to UA1, we could describe this as adding back the blocks (it's not really a block expression) from the napkin model, supporting an arrow form of statement switches with blocks too, and then restricting switches to all-arrow or all-colon. Then each quadrant is a restriction of this model. But that's not how we'd teach it.
>
> Relative to Java 10, we'd probably say:
>
>  - Switch statements now come in a simpler (arrow) flavor, where there is no fallthrough, no weird scoping, and no need to say break most of the time. Many switches can be rewritten this way, and this form can even be taught first.
>  - Switches can be used as either expressions or statements, with essentially identical syntax (some grammar differences, but this is mostly interesting only to spec writers). If a switch is an expression, it should contain expressions; if a switch is a statement, it should contain statements.
>  - Expression switches have additional restrictions that are derived exclusively from their expression-ness: totality, can only complete abruptly if by reason of throw.
>  - We allow a break-with-value statement in an expression switch as a means of explicitly providing the switch result; this can be combined with a statement block to allow for statements+break-expression.
>
> The result is one switch construct, with modern and legacy flavors, which supports either expressions or statements. You can immediately look at the middle of a switch and tell (by arrow vs colon) whether it has the legacy control flow or not.

I really like this proposal. The main issue I see (and I've already said this) is that now, when you see -> { ... } in Java code, it's not clear whether it opens a function scope or a block scope. But given how far down the rabbit hole we went when we tried to stick with the colon syntax, and the fact that nobody among the conference attendees I've discussed this with seems to care, going full arrows, with the interesting spin of letting statement switches be refactored to use ->, is the way to go.

Rémi

From kevinb at google.com Mon Apr 23 18:58:48 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Mon, 23 Apr 2018 11:58:48 -0700
Subject: JEP325: Switch expressions spec
In-Reply-To:
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com>
Message-ID:

On Mon, Apr 23, 2018 at 11:20 AM, Guy Steele wrote:

> (1) We have moved toward allowing "arrow versus colon" to be a syntax choice that is COMPLETELY orthogonal to other choices about the use of `switch`. If this rule is to hold universally, then any switch statement or expression should be convertible between the arrow form and the colon form using a simple, uniform rule.
>
> (2) In switch expressions we want to be able to use the concise notation `case a, b, c -> s;` for a switch clause.
>
> (3) From (1) and (2) we inexorably conclude that `case a, b, c: break s;` must also be a valid syntax.
>
> (4) But we could also have written (3) as `case a: case b: case c: break s;` and we certainly expect them to have equivalent behavior.
>
> (5) From (1) and (4) we conclude that we ought also to be able to write `case a -> case b -> case c -> s;`.
>

Not necessarily, if one simply views (4) as being an artifact of colonform switch's capacity for fall-through, which we know should not carry over. (Although we technically don't use the term "fall-through" in this no-intervening-code case, it works the same way and many people do think of it that way.)

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From forax at univ-mlv.fr Mon Apr 23 19:07:13 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 23 Apr 2018 21:07:13 +0200 (CEST)
Subject: JEP325: Switch expressions spec
In-Reply-To: <6109AA8E-6F32-4ED5-BFFA-2A77E4C279BE@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> <1665147122.374932.1524508168404.JavaMail.zimbra@u-pem.fr> <59752747-20F0-49E0-94E0-DA54FB9E6931@oracle.com> <6109AA8E-6F32-4ED5-BFFA-2A77E4C279BE@oracle.com>
Message-ID: <357832084.378693.1524510433185.JavaMail.zimbra@u-pem.fr>

'->', being a two-character symbol (at least if you do not enable font ligatures in your IDE/editor), is a stronger separator than the comma ',', so I think it's easier to visually parse
    case pat1, pat2 -> s
as
    case [pat1, pat2] -> s
than as
    case pat1, [pat2 -> s]

Rémi

----- Original Message -----
> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Monday, April 23, 2018 20:32:26
> Subject: Re: JEP325: Switch expressions spec

>> On Apr 23, 2018, at 2:27 PM, Guy Steele wrote:
>>
>> Good point, Rémi. However, note that
>>
>>     case pat1, pat2 -> s
>>
>> is equally too close to
>>
>>     case pat1 -> pat2 -> s
>>
>> and again they have very different meanings.
>>
>> We have to admit that there is room to blunder with this syntax.
>>
>> One way out would be to use a different arrow for `switch` statements:
>>
>>     switch (x) {
>>       case pat1 => case pat2 => s1;
>>       case pat3 => pat4 -> s2;
>>       case pat5, pat6 => s2;
>>       case pat7, pat8 => pat9 -> s4;
>>     }
>
> As a careful coder, if I did not have a separate arrow `=>` (and probably even if I did), I would use formatting and parentheses to convey my intent:
>
>     switch (x) {
>       case pat1 ->
>       case pat2 -> s1;
>       case pat3 -> (pat4 -> s2);
>       case pat5, pat6 -> s2;
>       case pat7, pat8 -> (pat9 -> s4);
>     }
>
> --Guy

From guy.steele at oracle.com Mon Apr 23 19:00:55 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 23 Apr 2018 15:00:55 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To:
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com>
Message-ID:

> On Apr 23, 2018, at 2:58 PM, Kevin Bourrillion wrote:
>
> On Mon, Apr 23, 2018 at 11:20 AM, Guy Steele wrote:
>
> (1) We have moved toward allowing "arrow versus colon" to be a syntax choice that is COMPLETELY orthogonal to other choices about the use of `switch`. If this rule is to hold universally, then any switch statement or expression should be convertible between the arrow form and the colon form using a simple, uniform rule.
>
> (2) In switch expressions we want to be able to use the concise notation `case a, b, c -> s;` for a switch clause.
>
> (3) From (1) and (2) we inexorably conclude that `case a, b, c: break s;` must also be a valid syntax.
>
> (4) But we could also have written (3) as `case a: case b: case c: break s;` and we certainly expect them to have equivalent behavior.
>
> (5) From (1) and (4) we conclude that we ought also to be able to write `case a -> case b -> case c -> s;`.
>
> Not necessarily, if one simply views (4) as being an artifact of colonform switch's capacity for fall-through, which we know should not carry over. (Although we technically don't use the term "fall-through" in this no-intervening-code case, it works the same way and many people do think of it that way.)

You could view it that way, but such a view is incorrect, going back to JLS1.

I know that many people develop alternate "folk models" for how they think things work or ought to work, but sometimes such alternate models lead one astray. This is one such instance. This is why we have a spec.

My argument does not rely on fallthrough; it relies on the notion of a case label being "associated with" a statement. JLS is quite clear on this.

To be more explicit about assumption (1) above: I propose the simple, uniform rule that the way to convert colon form to arrow form is to (a) replace every colon in a SwitchLabel with an arrow, then (b) add braces and "break" where necessary (and the rules for this depend on whether you are converting a statement switch or an expression switch). The point of this definition is that step (a) need not require any exceptions or special cases.

--Guy

From guy.steele at oracle.com Mon Apr 23 19:07:41 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 23 Apr 2018 15:07:41 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To: <357832084.378693.1524510433185.JavaMail.zimbra@u-pem.fr>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com> <1665147122.374932.1524508168404.JavaMail.zimbra@u-pem.fr> <59752747-20F0-49E0-94E0-DA54FB9E6931@oracle.com> <6109AA8E-6F32-4ED5-BFFA-2A77E4C279BE@oracle.com> <357832084.378693.1524510433185.JavaMail.zimbra@u-pem.fr>
Message-ID:

I agree; and that is exactly why I would throw in a newline in situations where I cared to use two arrows:

    case pat1 ->
    case pat2 -> s1;

is much to be preferred to

    case pat1 -> case pat2 -> s1;

Look, I also think that it's bad form to write

    a = b ? c : d;

because of the risk of misinterpretation; I would write

    a = (b ? c : d);

I don't think the risks of switch syntax are worse (or better) than this.

> On Apr 23, 2018, at 3:07 PM, forax at univ-mlv.fr wrote:
>
> '->', being a two-character symbol (at least if you do not enable font ligatures in your IDE/editor), is a stronger separator than the comma ',',
> so I think it's easier to visually parse
>     case pat1, pat2 -> s
> as
>     case [pat1, pat2] -> s
> than as
>     case pat1, [pat2 -> s]
>
> Rémi
>
> ----- Original Message -----
>> From: "Guy Steele"
>> To: "Remi Forax"
>> Cc: "amber-spec-experts"
>> Sent: Monday, April 23, 2018 20:32:26
>> Subject: Re: JEP325: Switch expressions spec
>
>>> On Apr 23, 2018, at 2:27 PM, Guy Steele wrote:
>>>
>>> Good point, Rémi. However, note that
>>>
>>>     case pat1, pat2 -> s
>>>
>>> is equally too close to
>>>
>>>     case pat1 -> pat2 -> s
>>>
>>> and again they have very different meanings.
>>>
>>> We have to admit that there is room to blunder with this syntax.
>>>
>>> One way out would be to use a different arrow for `switch` statements:
>>>
>>>     switch (x) {
>>>       case pat1 => case pat2 => s1;
>>>       case pat3 => pat4 -> s2;
>>>       case pat5, pat6 => s2;
>>>       case pat7, pat8 => pat9 -> s4;
>>>     }
>>
>> As a careful coder, if I did not have a separate arrow `=>` (and probably even if I did), I would use formatting and parentheses to convey my intent:
>>
>>     switch (x) {
>>       case pat1 ->
>>       case pat2 -> s1;
>>       case pat3 -> (pat4 -> s2);
>>       case pat5, pat6 -> s2;
>>       case pat7, pat8 -> (pat9 -> s4);
>>     }
>>
>> --Guy

From guy.steele at oracle.com Mon Apr 23 19:29:52 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 23 Apr 2018 15:29:52 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To:
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com>
Message-ID: <78A8B076-B29C-45E4-B340-7EF5018DED13@oracle.com>

> On Apr 23, 2018, at 3:00 PM, Guy Steele wrote:
> . . .
> You could view it that way, but such a view is incorrect, going back to JLS1.
>
> I know that many people develop alternate "folk models" for how they think things work or ought to work, but sometimes such alternate models lead one astray. This is one such instance. This is why we have a spec.
>
> My argument does not rely on fallthrough; it relies on the notion of a case label being "associated with" a statement. JLS is quite clear on this.

Alex Buckley has kindly pointed out to me that the preceding remark is in error. I misread the wording in JLS1 14.9 (and JDK8 JLS 14.11), paragraph 3. A case label is "associated with" the containing `switch` statement, not the statement that follows it in the SwitchBlockStatementGroup. I should have said:

My argument does not rely on fallthrough; it relies on the notion of executing all statements after the matching case label in the switch block. This execution of statements, once initiated, may then "fall through"
labels, but the initiation of this execution (specifically, initiation of execution of the first such statement) does not fall through labels. JLS is quite clear on this.

This is also confirmed by the first sentence in that section: "The `switch` statement transfers control to one of several statements depending on the value of an expression." Despite the fact that one might like to think of "transferring control to a case label" rather than to a statement, the official model is that control is transferred directly to a statement.

> To be more explicit about assumption (1) above: I propose the simple, uniform rule that the way to convert colon form to arrow form is to (a) replace every colon in a SwitchLabel with an arrow, then (b) add braces and "break" where necessary (and the rules for this depend on whether you are converting a statement switch or an expression switch). The point of this definition is that step (a) need not require any exceptions or special cases.
>
> --Guy
>

From kevinb at google.com Mon Apr 23 20:22:24 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Mon, 23 Apr 2018 13:22:24 -0700
Subject: JEP325: Switch expressions spec
In-Reply-To:
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com>
Message-ID:

On Mon, Apr 23, 2018 at 12:00 PM, Guy Steele wrote:

> On Apr 23, 2018, at 2:58 PM, Kevin Bourrillion wrote:
>
> On Mon, Apr 23, 2018 at 11:20 AM, Guy Steele wrote:
>
>> (1) We have moved toward allowing "arrow versus colon" to be a syntax choice that is COMPLETELY orthogonal to other choices about the use of `switch`. If this rule is to hold universally, then any switch statement or expression should be convertible between the arrow form and the colon form using a simple, uniform rule.
>>
>> (2) In switch expressions we want to be able to use the concise notation `case a, b, c -> s;` for a switch clause.
>>
>> (3) From (1) and (2) we inexorably conclude that `case a, b, c: break s;` must also be a valid syntax.
>>
>> (4) But we could also have written (3) as `case a: case b: case c: break s;` and we certainly expect them to have equivalent behavior.
>>
>> (5) From (1) and (4) we conclude that we ought also to be able to write `case a -> case b -> case c -> s;`.
>>
>
> Not necessarily, if one simply views (4) as being an artifact of colonform switch's capacity for fall-through, which we know should not carry over. (Although we technically don't use the term "fall-through" in this no-intervening-code case, it works the same way and many people do think of it that way.)
>
> You could view it that way, but such a view is incorrect, going back to JLS1.
>

To be more clear, I wasn't trying to make a statement about what is correct or incorrect by the spec. (On such matters I will always be deferring to the rest of you!)

My claim is just that it is not hard for a user to make sense of why `case A: case B: x` would work in colonform yet `case A -> case B -> x` might not work in arrowform. These don't necessarily feel contradictory. A user may simply understand that since colonform's design is made to support fall-through, that became an obvious way that *it* could address multiple labels as well, whereas the same does not apply in arrowform.

Okay, so it's a "folk model". I think that neither makes it automatically good nor automatically bad.

To the user, I believe that the ability to choose `case A -> case B ->` is an unnecessary choice and feels like the same kind of baggage that I'd hoped to leave behind when moving to arrowform.

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From guy.steele at oracle.com Mon Apr 23 22:48:53 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 23 Apr 2018 18:48:53 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To:
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <144bd9d6-fca8-eec5-db85-156d2ad98519@oracle.com> <55cf153c-c602-24d6-8675-ee321b2d9a8d@oracle.com>
Message-ID: <9829948F-23FD-4765-98EA-ABE668D5281A@oracle.com>

Sent from my iPhone

> On Apr 23, 2018, at 4:22 PM, Kevin Bourrillion wrote:
>
> On Mon, Apr 23, 2018 at 12:00 PM, Guy Steele wrote:
>>> On Apr 23, 2018, at 2:58 PM, Kevin Bourrillion wrote:
>>>
>>>> On Mon, Apr 23, 2018 at 11:20 AM, Guy Steele wrote:
>>>>
>>>> (1) We have moved toward allowing "arrow versus colon" to be a syntax choice that is COMPLETELY orthogonal to other choices about the use of `switch`. If this rule is to hold universally, then any switch statement or expression should be convertible between the arrow form and the colon form using a simple, uniform rule.
>>>>
>>>> (2) In switch expressions we want to be able to use the concise notation `case a, b, c -> s;` for a switch clause.
>>>>
>>>> (3) From (1) and (2) we inexorably conclude that `case a, b, c: break s;` must also be a valid syntax.
>>>>
>>>> (4) But we could also have written (3) as `case a: case b: case c: break s;` and we certainly expect them to have equivalent behavior.
>>>>
>>>> (5) From (1) and (4) we conclude that we ought also to be able to write `case a -> case b -> case c -> s;`.
>>>
>>> Not necessarily, if one simply views (4) as being an artifact of colonform switch's capacity for fall-through, which we know should not carry over. (Although we technically don't use the term "fall-through" in this no-intervening-code case, it works the same way and many people do think of it that way.)
>>
>> You could view it that way, but such a view is incorrect, going back to JLS1.
>
> To be more clear, I wasn't trying to make a statement about what is correct or incorrect by the spec. (On such matters I will always be deferring to the rest of you!)
>
> My claim is just that it is not hard for a user to make sense of why `case A: case B: x` would work in colonform yet `case A -> case B -> x` might not work in arrowform. These don't necessarily feel contradictory. A user may simply understand that since colonform's design is made to support fall-through, that became an obvious way that it could address multiple labels as well, whereas the same does not apply in arrowform.

Fair enough.

> Okay, so it's a "folk model". I think that neither makes it automatically good nor automatically bad.
>
> To the user, I believe that the ability to choose `case A -> case B ->` is an unnecessary choice and feels like the same kind of baggage that I'd hoped to leave behind when moving to arrowform.

Yep. Agreed that you would almost never want to use it. But it would give me a warm, fuzzy feeling to know it's lurking there for the few times you really need it
-------------- next part --------------
An HTML attachment was scrubbed...

From brian.goetz at oracle.com Tue Apr 24 06:51:25 2018
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 24 Apr 2018 08:51:25 +0200
Subject: [switch] Further unification on switch
In-Reply-To: <2102124457.376153.1524509317075.JavaMail.zimbra@u-pem.fr>
References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> <2102124457.376153.1524509317075.JavaMail.zimbra@u-pem.fr>
Message-ID: <09F1288D-FA7B-4027-8A25-7DBF442F194D@oracle.com>

What did your poll say?

> On Apr 23, 2018, at 8:48 PM, Remi Forax wrote:
>
> the fact that nobody among the conference attendees i've discussed with seem to care
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From kevinb at google.com Tue Apr 24 21:59:29 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 24 Apr 2018 14:59:29 -0700 Subject: [switch] Further unification on switch In-Reply-To: References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> <856ABF99-276C-4DAF-B92F-F9CB60A64C01@oracle.com> Message-ID: (Moving a reply from -observers to -experts where I'd meant it to be; and then correcting it because it was wrong. This isn't a key point of debate though, so apologies for the nuisance.) On Fri, Apr 20, 2018 at 7:15 AM, Kevin Bourrillion wrote: > On Fri, Apr 20, 2018 at 4:09 AM, Victor Nazarov > wrote: > > This proposal seems alright. But isn't it a division instead of a >> unification? >> The main argument against new shiny match expression was that is will >> exists in parallel with old rusty switch statement. >> Instead the decision was to enhance switch statement to cover use cases of >> match expression. >> >> With last proposal what we get is: old rusty familiar switch >> statement/expression and new shiny arrow-switch statement/expression? >> I see it as the same division that we tried to avoid: two similar, but not >> quite the same syntax-forms. >> > > Well, at least they are only different on the *inside*. Whenever looking > at one from the outside, it is still the same black box. (Actually two > black boxes, but *that* split is expression vs. statement.) That's > something we wouldn't have with `match` or a new operator, and that's > something. > > Here's a fun statistic from Google's codebase. We analyzed every > hand-written switch statement in our depot. Only *2.4%* of them used > fall-through. The number in the real world might be somewhat higher (?), > but we know it to be quite small. Yet, the things about switch that have > been weird, confusing, or dangerous all stem from that fall-through model > that such a tiny fraction need -- how sad! 
>

We were accidentally including switch statements that had only omitted `break` on the final statement group. The new number is 1.2%.

It also appears that very roughly one-third of these are getting such trivial benefit from fall-through that they could *still* change to arrowform pretty happily anyway.

These stats are mostly just useful for confirming that, yes, there is real positive value in this feature.

> I think the highest order bit in this discussion, by far, is that we have
> found a way to make >95% of all switch statements a lot easier to write and
> read. So, *even if* some of us see this as us doing exactly what Brian
> said we shouldn't (in his "What we're not going to do" section), I *still* think
> it easily clears the bar and will be a very successful change.
>

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com
-------------- next part --------------
An HTML attachment was scrubbed...

From forax at univ-mlv.fr Tue Apr 24 23:09:16 2018
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 25 Apr 2018 01:09:16 +0200 (CEST)
Subject: [switch] Further unification on switch
In-Reply-To: <09F1288D-FA7B-4027-8A25-7DBF442F194D@oracle.com>
References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> <2102124457.376153.1524509317075.JavaMail.zimbra@u-pem.fr> <09F1288D-FA7B-4027-8A25-7DBF442F194D@oracle.com>
Message-ID: <108760264.812488.1524611356575.JavaMail.zimbra@u-pem.fr>

first question, do you prefer break syntax or -> syntax
  119 for break, 173 for ->

second question, do you prefer break syntax, -> syntax or ':' + expression syntax
  43 for break, 94 for ->, 155 for ':' + expr

Rémi

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Tuesday, April 24, 2018 08:51:25
> Subject: Re: [switch] Further unification on switch

> What did your poll say?
>> On Apr 23, 2018, at 8:48 PM, Remi Forax < [ mailto:forax at univ-mlv.fr | >> forax at univ-mlv.fr ] > wrote: >> the fact that nobody among the conference attendees i've discussed with seem to >> care -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmitry.petrashko at gmail.com Wed Apr 25 04:02:29 2018 From: dmitry.petrashko at gmail.com (Dmitry Petrashko) Date: Tue, 24 Apr 2018 21:02:29 -0700 Subject: [switch] Further unification on switch In-Reply-To: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> References: <88ba33f3-5c6a-62ac-21ad-f703e705f0cc@oracle.com> Message-ID: I like this proposal and, in particular, I strongly support " ## Closing the gap" section. Enforcing uniform style on every particular switch allows to have clean and intuitive semantics for arrow switches while giving a straightforward migration path that can be assisted by tools to old-style ones. -Dmitry On Thu, Apr 19, 2018 at 1:44 PM, Brian Goetz wrote: > We've been reviewing the work to date on switch expressions. Here's where > we are, and here's a possible place we might move to, which I like a lot > better than where we are now. > > ## Goals > > As a reminder, remember that the primary goal here is _not_ switch > expressions; switch expressions are supposed to just be an uncontroversial > waypoint on the way to the real goal, which is a more expressive and > flexible switch construct that works in a wider variety of situations, > including supporting patterns, being less hostile to null, use as either an > expression or a statement, etc. > > And the reason we think that improving switch is the right primary goal is > because a "do one of these based on ..." 
construct is _better_ than the > corresponding chain of if-else-if, for multiple reasons: > > - Possibility for the compiler to do exhaustiveness analysis, potentially > finding more bugs; > - Possibility for more efficient dispatch -- a switch could be O(1), > whereas an if-else chain is almost certainly O(n); > - More semantically transparent -- it's obvious the user is saying "do > one of these, based on ..."; > - Eliminates the need to repeat (and possibly get wrong) the switch > target. > > Switch does come with a lot of baggage (fallthrough by default, > questionable scoping, need to explicitly break), and this baggage has > produced the predictable distractions in the discussion -- a desire that we > subordinate the primary goal (making switch more expressive) to the more > contingent goal of "fixing" the legacy problems of switch. > > These legacy problems of switch may be unfortunate, but to whatever degree > we end up ameliorating these, this has to be purely a side-benefit -- it's > not the primarily goal, no matter how annoying people find them. (The > desire to "fix" the mistakes of the past is frequently a siren song, which > is why we don't allow ourselves to take these as first-class requirements.) > > #### What we're not going to do > > The worst possible outcome (which is also the most commonly suggested > "solution" in forums like reddit) would be to invent a new construct that > is similar to, but not quite the same as switch (`snitch`), without being a > 100% replacement for today's quirky switch. Today's switch is surely > suboptimal, but it's not so fatally flawed that it needs to be euthanized, > and we don't want to create an "undead" language construct forever, which > everyone will still have to learn, and keep track of the differences > between `switch` and `snitch`. No thank you. 
> > That means we extend the existing switch statement, and increase > flexibility by supporting an expression form, and to the degree needed, > embrace its quirks. ("No statement left behind.") > > #### Where we started > > In the first five minutes of working on this project, we sketched out the > following (call it the "napkin sketch"), where an expression switch has > case arms of the form: > > case L -> e; > or > case L -> { statement*; break e; } > > This was enough to get started, but of course the devil is in the details. > > #### Where we are right now > > We moved away from the napkin sketch for a few reasons, in part because it > seemed to be drawing us down the road towards switch and snitch -- which > was further worrying as we still had yet to deal with the potential that > pattern switch and constant switch might have differences as well. We want > a unified model of switch that deals well enough with all the cases -- > expressions and statements, patterns and constants. > > Our current model (call this Unification Attempt #1, or UA1 for short) is > a step towards a unified model of switch, and this is a huge step forward. > In this model, there's one switch construct, and there's one set of control > flow rules, including for break (like return, break takes a value in a > value context and is void in a void context). > > For convenience and safety, we then layered a shorthand atop value-bearing > switches, which is to interpret > > case L -> e; > > as > > case L: break e; > > expecting the shorter form would be used almost all the time. (This has a > pleasing symmetry with the expression form of lambdas, and (at least for > expression switches) alleviates two of the legacy pain points. Switch > expressions have other things in common with lambdas too; they are the only > ones that can have statements; they are the only ones that interact with > nonlocal control flow.) > > This approach offers a lot of flexibility (some would say too much). 
You > can write "remi-style" expression switches: > > int x = switch (y) { > case 1: break 2; > case 2: break 4; > default: break 8; > }; > > or you can write "new-style" expression switches: > > int x = switch (y) { > case 1 -> 2; > case 2-> 4; > default-> 8; > }; > > Some people like the transparency of the first; others like the > compactness and fallthrough-safety of the second. And in cases where you > mostly want the benefits of the second, but the real world conspires to > make one or two cases difficult, you can mix them, and take full advantage > of what "old switch" does -- with no new rules for control flow. > > #### Complaints > > There were the usual array of complaints over syntax -- many of which can > be put down to "bleah, new is different, different is bad", but the most > prominent one seems to be a generalized concern that other users (never us, > of course, but we always fear for what others might do) won't be able to > "handle" the power of mixed switches and will write terrible code, and then > the world will burn. (And, because the mixing comes with fallthrough, it > further engenders the "you idiots, you fixed the wrong thing" reactions.) > Personally, I think the fear of mixing is deeply overblown -- I think in > most cases people will gravitate towards one of the two clean styles, and > only mix where the complexity of the real world forces them to, but there's > value in understanding the underpinnings of such reactions, even if in the > end they'd turn out to be much hot air about nothing. > > #### A real issue with mixing! > > But, there is a real problem with our approach, which is: while a unified > switch is the right goal, UA1 is not unified _enough_. Specifically, we > haven't fully aligned the statement forms, and this conspires to reduce > expressiveness and safety. 
That is, in an expression switch you can say: > > case L -> e; > > but in a statement switch you can't say > > case L -> s; > > The reason for this is a purely accidental one: if we allowed this, then > we _would_ likely find ourselves in the mixing hell that people are afraid > of, which in turn would make the risk of accidental fallthrough _even > worse_ than it is today. So the failing of mixing is not that it will be > abused, but that it constrains us from actually getting to a unified > construct. > > ## Closing the gap > > So, let's take one more step towards unifying the two forms (call this > UA2), rather than a step away from it. Let's say that _all_ switches can > support either old-style (colon) or new-style (arrow) case labels -- but > must stick to one kind of case label in a given switch: > > // statement switch > switch (x) { > case 1: println("one"); break; > case 2: println("two"); break; > } > > or > > // also statement switch > switch (x) { > case 1 -> println("one"); > case 2 -> println("two"); > } > > If a switch is a statement, the RHS is a statement, which can be a block > statement: > > case L -> { a; b; } > > We get there by first taking a step backwards, at least in terms of > superficial syntax, to the syntax suggested by the napkin sketch, where if > a switch is an expression, the RHS of an -> case is an expression or a > block statement (in the latter case, it must complete abruptly by reason of > either break-value or throw). Just as we expected "break value" to be rare > in expression switches under UA1 since developers will generally prefer the > shorthand form where applicable, we expect it to be equally rare under UA2. > > Then, as in UA1, we render unto expressions the things that belong to > expressions; they must be total (an expression must yield a value or > complete abruptly by reason of throwing.) > > #### Look, accidental benefits! 
> > Many of switches failings (fallthrough, scoping) are not directly > specified features, as much as emergent properties of the structure and > control flow of switches. Since by definition you can't fall out of a > arrow case, then an all-arrow switch gives the fallthrough-haters what they > want "for free", with no need to treat it specially. In fact, its even > better; in the all-arrow form, all of the things people hate about switch > -- the need to say break, the risk of fallthrough, and the questionable > scoping -- all go away. > > #### Scorecard > > There is one switch construct, which can be use as either an expression or > a statement; when used as an expression, it acquires the characteristics of > expressions (must be total, no nonlocal control flow out.) Each can be > expressed in one of two syntactic forms (arrow and colon.) All forms will > support patterns, null handling, and multiple labels per case. The control > flow and scoping rules are driven by structural properties of the chosen > form. > > The (statement, colon) case is the switch we have since Java 1.0, enhanced > as above (patterns, nulls, etc.) > > The (statement, arrow) case can be considered a nice syntactic shorthand > for the previous, which obviates the annoyance of "break", implicitly > prevents fallthrough of all forms, and avoids the confusion of current > switch scoping. Many existing statement switches that are not expressions > in disguise can be refactored to this. > > The (expression, colon) form is a subset of UA1, where you just never say > "arrow". > > The (expression, arrow) case can again be considered a nice shorthand for > the previous, again a subset of UA1, where you just never say "colon", and > as a result, again don't have to think about fallthrough. > > Totality is a property of expression switches, regardless of form, because > they are expressions, and expressions must be total. 
> > Fallthrough is a property of the colon-structured switches; there are no > changes here. > > Nonlocal control flow _out_ of a switch (continue to an enclosing loop, > break with label, return) are properties of statement switches. > > So essentially, rather than dividing the semantics along > expression/statement lines, and then attempting to opportunistically heap a > bunch of irrelevant features like "no fallthrough" onto the expression side > "because they're cool" even though they have nothing to do with > expression-ness, we instead divide the world structurally: the colon form > gives you the old control flow, and the arrow form gives you the new. And > either can be used as a statement, or an expression. And no one will be > confused by mixing. > > Orthogonality FTW. No statement gets left behind. > > ## Explaining it > > Relative to UA1, we could describe this as adding back the blocks (its not > really a block expression) from the napkin model, supporting an arrow form > of statement switches with blocks too, and then restricting switches to > all-arrow or all-colon. Then each quadrant is a restriction of this > model. But that's not how we'd teach it. > > Relative to Java 10, we'd probably say: > > - Switch statements now come in a simpler (arrow) flavor, where there is > no fallthrough, no weird scoping, and no need to say break most of the > time. Many switches can be rewritten this way, and this form can even be > taught first. > - Switches can be used as either expressions or statements, with > essentially identical syntax (some grammar differences, but this is mostly > interesting only to spec writers). If a switch is an expression, it should > contain expressions; if a switch is a statement, it should contain > statements. > - Expression switches have additional restrictions that are derived > exclusively from their expression-ness: totality, can only complete > abruptly if by reason of throw. 
> - We allow a break-with-value statement in an expression switch as a > means of explicitly providing the switch result; this can be combined with > a statement block to allow for statements+break-expression. > > The result is one switch construct, with modern and legacy flavors, which > supports either expressions or statements. You can immediately look at the > middle of a switch and tell (by arrow vs colon) whether it has the legacy > control flow or not. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gavin.bierman at oracle.com Fri Apr 27 15:03:50 2018 From: gavin.bierman at oracle.com (Gavin Bierman) Date: Fri, 27 Apr 2018 16:03:50 +0100 Subject: JEP325: Switch expressions spec In-Reply-To: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> Message-ID: <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com> I have uploaded the latest draft of the spec for JEP 325 at http://cr.openjdk.java.net/~gbierman/switch-expressions.html Changes from the last version: * Supports new -> label form in both switch expressions and switch statements * Added typing rules for switch expression * Restrict the type of a selector expression to not include long, double and float as previously proposed * Misc smaller changes from community feedback (thanks!) Comments welcomed! Gavin > On 12 Apr 2018, at 22:27, Gavin Bierman wrote: > > I have uploaded a draft spec for JEP 325: Switch expressions at http://cr.openjdk.java.net/~gbierman/switch-expressions.html > > Note there are still three things missing: > > * There is no text about typing a switch expression, as this is still being discussed on this list. > * There is no name given for the exception raised at runtime when a switch expression fails to find a matching pattern label, as this is still being discussed on this list. > * The spec currently permits fall through from a "case pattern:? 
statement group into a "case pattern ->" clause. We are still working through the consequences of removing this possibility. > > Comments welcomed! > Gavin From kevinb at google.com Fri Apr 27 15:59:45 2018 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 27 Apr 2018 08:59:45 -0700 Subject: `case null:` (here we go) Message-ID: >From newest spec: *ClauseLabel*: *CasePattern ->* *CasePattern:* *case Pattern { , Pattern }* *default* *Pattern:* *ConstantExpression* *EnumConstantName* *null* Just a reminder that we still have this conflict to resolve. Even when you learn the wart that `default` is not covering null, there is no way to *make* it do so without repeating the RHS. We need `default, case null:`. Of course, my preferred resolution is to *hold off* on `case null` support for now. I'd like to give one honest attempt at a full argument; I hope to get an engaged response to it, and then if a decision is made against it, I'll move on, figure I was wrong, and never speak of it again. :-) ~~ 1. For the sake of argument I'll just concede the notion that today's null behavior in switch was a mistake and if we could turn back time we would change it. 2. On the flip side, I think that proponents of the feature can probably concede that we would *never* have designed it in the now-proposed way for a fresh language; it *is* a permanent wart brought about by historical accident only. (Yes?) So, I assume that proponents recognize that rejecting this change is at least *defensible*, by appealing to the compatibility constraints we inherited. We should not need to worry that we will "look like idiots" (or whatever terms our deepest fears phrase themselves in :-)). 3. Many users since 2004 have said, and will continue to say, that they wish `case null` were allowed. However, I don't think we can assume they are necessarily comparing today's behavior to the *actual* feature we are proposing, with its warts. 
It is likely and natural that they are really comparing today's behavior to the time-machine feature of our having supported `case null` from the start. Therefore I think we have to take most such requests with a grain of salt. 4. Yes, sometimes users do write code that simulates `case null` and that code could be nicely simplified if null were allowed in switch. I can do a better job of quantifying the incidence of this need in our large codebase if necessary (but have been prioritizing string literal research for now). But fundamentally, the feature *is* a win, for *these* examples. 5. It has been implied that patterns are what make the current null treatment "untenable". To me, I don't think this argument has been made convincingly yet. It seems to add up to "there may be a few more of those cases where you bump up against the prohibition and think 'oh yeah, grrrr, can't switch directly on null because of reasons no one understands!'" But fundamentally it seems like the same problem it already was. 6. That benefit has to be weighed against the damage we will be causing. Here is the meat of it: - `default` will no longer mean default. There is really no way around that. - Null will be treated unaccountably differently from all other values in switch. It becomes harder to explain how switch works -- "sorry, no null" is at least easy. Instead it's "well, switch *itself* allows null, but it assumes you want a `case null` that throws if you don't say otherwise". Looked at without knowing all the baggage, is this not a bit bizarre? - Also (back to how this email started), this appears to be the only factor forcing us to introduce a `default, case x` syntax we would never otherwise need - or to mint some other bespoke construction we would, again, never otherwise need. >From where I sit, the cost is clearly too great compared to the benefit. 
While "never doing anything at all about this" might not be the solution, I am at least confident that the current proposal is not the right solution either, and I'd like to convince us to bench it. But again - if we can make either decision clearly, I'll be done here, and not even grumpy. On Fri, Apr 27, 2018 at 8:03 AM, Gavin Bierman wrote: > I have uploaded the latest draft of the spec for JEP 325 at > http://cr.openjdk.java.net/~gbierman/switch-expressions.html > > Changes from the last version: > * Supports new -> label form in both switch expressions and switch > statements > * Added typing rules for switch expression > * Restrict the type of a selector expression to not include long, double > and float as previously proposed > * Misc smaller changes from community feedback (thanks!) > > Comments welcomed! > Gavin > > > On 12 Apr 2018, at 22:27, Gavin Bierman > wrote: > > > > I have uploaded a draft spec for JEP 325: Switch expressions at > http://cr.openjdk.java.net/~gbierman/switch-expressions.html > > > > Note there are still three things missing: > > > > * There is no text about typing a switch expression, as this is still > being discussed on this list. > > * There is no name given for the exception raised at runtime when a > switch expression fails to find a matching pattern label, as this is still > being discussed on this list. > > * The spec currently permits fall through from a "case pattern:? > statement group into a "case pattern ->" clause. We are still working > through the consequences of removing this possibility. > > > > Comments welcomed! > > Gavin > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
From vicente.romero at oracle.com Fri Apr 27 17:31:02 2018
From: vicente.romero at oracle.com (Vicente Romero)
Date: Fri, 27 Apr 2018 13:31:02 -0400
Subject: [constables] RFR of constants API
Message-ID: <451987dd-9371-552f-908d-e57fdc1f09ff@oracle.com>

Hi all,

Please review the current proposal of the constants API, which are nominal descriptor types defined in pkg java.lang.invoke.constant. The code can be found at [1]. This API is being developed in the context of JEP 303: Intrinsics for the LDC and INVOKEDYNAMIC Instructions [2]

Thanks in advance for your comments,
Vicente

[1] http://cr.openjdk.java.net/~vromero/constant.api/webrev.00
[2] http://openjdk.java.net/jeps/303

From alex.buckley at oracle.com Fri Apr 27 21:55:29 2018
From: alex.buckley at oracle.com (Alex Buckley)
Date: Fri, 27 Apr 2018 14:55:29 -0700
Subject: JEP325: Switch expressions spec
In-Reply-To: <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com>
Message-ID: <5AE39C51.7040300@oracle.com>

On 4/27/2018 8:03 AM, Gavin Bierman wrote:
> I have uploaded the latest draft of the spec for JEP 325 at http://cr.openjdk.java.net/~gbierman/switch-expressions.html

14.16 is right to say that:

   A break statement with value Expression ***attempts to cause the evaluation of the immediately enclosing switch expression*** to complete normally ...

because the following is legal (x will become 200) :

   int x = switch (e) {
       case 1 -> {
           try {
               break 100;
           } finally {
               break 200;
           }
       }
       default -> 0;
   };

Therefore, in the discussion section, please say that:

   The preceding descriptions say "attempts to transfer control" ***and "attempts to cause evaluation to complete normally",*** rather than just "transfers control" ***and "causes evaluation to complete normally",*** because if there are any try statements ...

   ...
innermost to outermost, before control is transferred to the break target ***or evaluation of the break target completes***. [Notice we don't yet know if evaluation of the break target will complete normally or abruptly. If the finally clause above was to throw an exception instead of break-200, then the switch expression would complete abruptly by reason of the exception, rather than completing normally with the value 100.] (Separately: Please flag the new text in 15.15's opening line.) Alex From brian.goetz at oracle.com Sat Apr 28 18:12:05 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 28 Apr 2018 20:12:05 +0200 Subject: [raw-string] indentation stripping Message-ID: <5157CBFA-7068-414E-8590-150713ED1B1A@oracle.com> This thread accidentally got started on the wrong list, so bringing it back here. The following messages are hereby read into the record (and hence can be considered to be under the proper terms of use for a specification list.) http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003034.html (Jim #1) http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003035.html (John) http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003051.html (Kevin #1) http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003052.html (Jim #2) http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003053.html (Kevin #2) http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003054.html (Jim #3) A summary follows. The key point being discussed is that the ?raw means raw? interpretation for multi-line strings is likely to be at odds with how users actually plan to use the feature ? that they will pad the code with incidental indentation to make it line up nicely with the enclosing Java code, and that IDEs may well adjust said incidental indentation as the code is maintained ? and that this is a reasonable thing to encourage. Kevin?s data from the Google codebase backs up this supposition. 
Our design already admits this to some degree: for multi-line strings, we don't really believe the source file when it uses platform-specific line terminators. So we're trying to distill how to distinguish "incidental" indentation from intended indentation in multi-line strings. (More generally: the feedback we've gotten is that while raw strings is the right design center for single-line strings, when it comes to snippets that span lines, users care more about multi-line-ness than raw-ness.)

Assumptions:

 - Most multi-line strings will be code snippets of some sort (JSON, XML, SQL, Java, etc);
 - Most developers will want to use incidental indentation to have code snippets indent "sensibly" relative to neighboring Java code, but said incidental indentation is not part of the snippet.

Jim's #1 offers a catalog of ways in which users might craft multi-line string literals to fit cleanly into their source code, identifying which indentation is incidental and which is essential.

To the goals, I'd add:

 - In addition to it being _possible_ to render the desired result, it should be straightforward for users to _predict_ the result of indentation stripping.

Kevin adds: it would be useful if we could draw a "rectangle" that excludes all incidental indentation and includes all intended indentation.

Tabs are a confounding issue; since there is no standard interpretation for how many spaces correspond to a tab, in the general case no trimming algorithm will do well with mixed spaces and tabs. However, in the well-behaved case where lines begin with tab* space*, a common prefix can be stripped.

There's some reason to believe that calling .stripIndent() will be so common that it should be the default, rather than requiring users to invoke it every time.

Now back to discussion.
-------------- next part --------------
An HTML attachment was scrubbed...
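
The well-behaved case mentioned in the summary above (strip the whitespace prefix shared by every non-blank line) might be sketched roughly as follows. This is only an illustration, not the proposed `.stripIndent()` behavior: the names `IndentSketch` and `stripCommonIndent` are made up for this note, and letting whitespace-only lines not constrain the prefix is an assumption.

```java
// Hypothetical helper for illustration only; not the proposed String API.
final class IndentSketch {

    // Length of the leading run of spaces and tabs in a line.
    static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
            i++;
        }
        return i;
    }

    // Longest common prefix, compared character by character,
    // so a tab never matches a run of spaces.
    static String commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }

    // Removes the indentation shared by all non-blank lines.
    // Assumption: whitespace-only lines are emitted as empty lines
    // and do not take part in computing the shared prefix.
    static String stripCommonIndent(String text) {
        String[] lines = text.split("\n", -1);
        String prefix = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String indent = line.substring(0, leadingWhitespace(line));
            prefix = (prefix == null) ? indent : commonPrefix(prefix, indent);
        }
        int cut = (prefix == null) ? 0 : prefix.length();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            if (!lines[i].trim().isEmpty()) {
                out.append(lines[i].substring(cut));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String snippet = "        SELECT *\n          FROM t\n        WHERE x = 1";
        System.out.println(stripCommonIndent(snippet));
    }
}
```

With mixed leading tabs and spaces across lines, the shared prefix can shrink to nothing and the text comes back unchanged, which matches the observation that no trimming rule does well in that case.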
From forax at univ-mlv.fr Sun Apr 29 18:14:24 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 29 Apr 2018 20:14:24 +0200 (CEST)
Subject: JEP325: Switch expressions spec
In-Reply-To: <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com>
Message-ID: <1519264476.2415564.1525025664083.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Gavin Bierman"
> To: "amber-spec-experts"
> Sent: Friday, April 27, 2018 17:03:50
> Subject: Re: JEP325: Switch expressions spec

> I have uploaded the latest draft of the spec for JEP 325 at
> http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>
> Changes from the last version:
> * Supports new -> label form in both switch expressions and switch statements
> * Added typing rules for switch expression
> * Restrict the type of a selector expression to not include long, double and
> float as previously proposed

I do not care about double and float; doing == on a double or a float is already dubious. But why drop the support for long?

> * Misc smaller changes from community feedback (thanks!)
>
> Comments welcomed!
> Gavin

Rémi

>
>> On 12 Apr 2018, at 22:27, Gavin Bierman wrote:
>>
>> I have uploaded a draft spec for JEP 325: Switch expressions at
>> http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>>
>> Note there are still three things missing:
>>
>> * There is no text about typing a switch expression, as this is still being
>> discussed on this list.
>> * There is no name given for the exception raised at runtime when a switch
>> expression fails to find a matching pattern label, as this is still being
>> discussed on this list.
>> * The spec currently permits fall through from a "case pattern:" statement group
>> into a "case pattern ->" clause. We are still working through the consequences
>> of removing this possibility.
>>
>> Comments welcomed!
>> Gavin

From forax at univ-mlv.fr Sun Apr 29 21:41:44 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 29 Apr 2018 23:41:44 +0200 (CEST)
Subject: `case null:` (here we go)
In-Reply-To:
References:
Message-ID: <326232349.2862.1525038104745.JavaMail.zimbra@u-pem.fr>

Hi Kevin,

depending on the syntactic construct, in the presence of null either you throw an NPE or you have special semantics; for example, try(var x = ...) silently doesn't call close() on null.

I do not think we can pretend that null doesn't exist. We want to be able to switch on any object, not just Strings and enums, so null will show up. What I like about the current proposal is that
- if you do not have a case null, you get an NPE, which is backward compatible
- if you have a case null, it's self-documenting.

I also expect that most switches where you want a case null are switches that don't require a default, because the compiler will be smart enough not to require one.

And as you said, the price to pay is that 'default' means any object but null.

Now that we have decided that the -> syntax doesn't allow fallthrough, I think we have no choice but to allow default as a possible 'Pattern' in the grammar. One interesting question is whether we should allow 'case default' and see 'default' as a shorter syntax for 'case default'.

And you can also, even if Brian will say no, discuss the value of the keyword 'case', i.e. make 'case' optional everywhere.

regards,
Rémi

PS: I'm not sure that if we went back to 2004 we would change the semantics of the switch, because designing for something that may occur 14 years later is not a good idea; moreover, by going back in time we would already have changed what the future will be :)

From: "Kevin Bourrillion"
To: "Gavin Bierman"
Cc: "amber-spec-experts"
Sent: Friday, April 27, 2018 17:59:45
Subject: `case null:` (here we go)

BQ_BEGIN
From newest spec:

    ClauseLabel:
        CasePattern ->

    CasePattern:
        case Pattern { , Pattern }
        default

    Pattern:
        ConstantExpression
        EnumConstantName
        null

Just a reminder that we still have this conflict to resolve. Even when you learn the wart that `default` is not covering null, there is no way to make it do so without repeating the RHS. We need `default, case null:`.

Of course, my preferred resolution is to hold off on `case null` support for now. I'd like to give one honest attempt at a full argument; I hope to get an engaged response to it, and then if a decision is made against it, I'll move on, figure I was wrong, and never speak of it again. :-)

~~

1. For the sake of argument I'll just concede the notion that today's null behavior in switch was a mistake and if we could turn back time we would change it.

2. On the flip side, I think that proponents of the feature can probably concede that we would never have designed it in the now-proposed way for a fresh language; it is a permanent wart brought about by historical accident only. (Yes?) So, I assume that proponents recognize that rejecting this change is at least defensible, by appealing to the compatibility constraints we inherited. We should not need to worry that we will "look like idiots" (or whatever terms our deepest fears phrase themselves in :-)).

3. Many users since 2004 have said, and will continue to say, that they wish `case null` were allowed.
However, I don't think we can assume they are necessarily comparing today's behavior to the actual feature we are proposing, with its warts. It is likely and natural that they are really comparing today's behavior to the time-machine feature of our having supported `case null` from the start. Therefore I think we have to take most such requests with a grain of salt.

4. Yes, sometimes users do write code that simulates `case null` and that code could be nicely simplified if null were allowed in switch. I can do a better job of quantifying the incidence of this need in our large codebase if necessary (but have been prioritizing string literal research for now). But fundamentally, the feature is a win, for these examples.

5. It has been implied that patterns are what make the current null treatment "untenable". To me, I don't think this argument has been made convincingly yet. It seems to add up to "there may be a few more of those cases where you bump up against the prohibition and think 'oh yeah, grrrr, can't switch directly on null because of reasons no one understands!'" But fundamentally it seems like the same problem it already was.

6. That benefit has to be weighed against the damage we will be causing. Here is the meat of it:

- `default` will no longer mean default. There is really no way around that.
- Null will be treated unaccountably differently from all other values in switch. It becomes harder to explain how switch works -- "sorry, no null" is at least easy. Instead it's "well, switch itself allows null, but it assumes you want a `case null` that throws if you don't say otherwise". Looked at without knowing all the baggage, is this not a bit bizarre?
- Also (back to how this email started), this appears to be the only factor forcing us to introduce a `default, case x` syntax we would never otherwise need - or to mint some other bespoke construction we would, again, never otherwise need.
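
Points 4 and 6 can be made concrete with a small sketch of today's behavior, using only the classic switch statement. The class and method names (NullSwitchSketch, describe, describeUnguarded) are hypothetical and exist only for illustration: the first method simulates `case null` with a guard before the switch, the second shows that switching directly on a null String throws NullPointerException.

```java
final class NullSwitchSketch {
    /** Simulates "case null" with an explicit guard before the switch. */
    static String describe(String command) {
        if (command == null) {
            return "none"; // the branch a `case null` label would express
        }
        switch (command) {
            case "start": return "starting";
            case "stop":  return "stopping";
            default:      return "unknown";
        }
    }

    /** No guard: a null selector throws before any label is considered. */
    static String describeUnguarded(String command) {
        switch (command) { // NullPointerException when command is null
            case "start": return "starting";
            default:      return "unknown"; // not reached for null
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        try {
            describeUnguarded(null);
        } catch (NullPointerException e) {
            System.out.println("default did not cover null");
        }
    }
}
```

The unguarded form is also why `default` already fails to cover null today; the proposal under discussion keeps that behavior when no `case null` is present.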

From where I sit, the cost is clearly too great compared to the benefit. While "never doing anything at all about this" might not be the solution, I am at least confident that the current proposal is not the right solution either, and I'd like to convince us to bench it.

But again - if we can make either decision clearly, I'll be done here, and not even grumpy.

On Fri, Apr 27, 2018 at 8:03 AM, Gavin Bierman <gavin.bierman at oracle.com> wrote:

> I have uploaded the latest draft of the spec for JEP 325 at http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>
> Changes from the last version:
> * Supports new -> label form in both switch expressions and switch statements
> * Added typing rules for switch expression
> * Restrict the type of a selector expression to not include long, double and float as previously proposed
> * Misc smaller changes from community feedback (thanks!)
>
> Comments welcomed!
> Gavin
>
> > On 12 Apr 2018, at 22:27, Gavin Bierman <gavin.bierman at oracle.com> wrote:
> >
> > I have uploaded a draft spec for JEP 325: Switch expressions at http://cr.openjdk.java.net/~gbierman/switch-expressions.html
> >
> > Note there are still three things missing:
> >
> > * There is no text about typing a switch expression, as this is still being discussed on this list.
> > * There is no name given for the exception raised at runtime when a switch expression fails to find a matching pattern label, as this is still being discussed on this list.
> > * The spec currently permits fall through from a "case pattern:" statement group into a "case pattern ->" clause. We are still working through the consequences of removing this possibility.
> >
> > Comments welcomed!
> > Gavin

--
Kevin Bourrillion | Java Librarian | Google, Inc.
| kevinb at google.com
BQ_END

From kevinb at google.com Mon Apr 30 15:39:10 2018
From: kevinb at google.com (Kevin Bourrillion)
Date: Mon, 30 Apr 2018 08:39:10 -0700
Subject: JEP325: Switch expressions spec
In-Reply-To: <1519264476.2415564.1525025664083.JavaMail.zimbra@u-pem.fr>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com> <1519264476.2415564.1525025664083.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Sun, Apr 29, 2018 at 11:14 AM, Remi Forax wrote:

> I do not care about double and float, doing == on a double or a float is
> already dubious, but why drop the support for long ?

I think the set should be expanded all at once. It's noisy otherwise. Secondarily to that I argued that it was better to wait and do them all together with patterns.

> > * Misc smaller changes from community feedback (thanks!)
> >
> > Comments welcomed!
> > Gavin
>
> Rémi
>
> >> On 12 Apr 2018, at 22:27, Gavin Bierman wrote:
> >>
> >> I have uploaded a draft spec for JEP 325: Switch expressions at
> >> http://cr.openjdk.java.net/~gbierman/switch-expressions.html
> >>
> >> Note there are still three things missing:
> >>
> >> * There is no text about typing a switch expression, as this is still being
> >> discussed on this list.
> >> * There is no name given for the exception raised at runtime when a switch
> >> expression fails to find a matching pattern label, as this is still being
> >> discussed on this list.
> >> * The spec currently permits fall through from a "case pattern:" statement group
> >> into a "case pattern ->" clause. We are still working through the consequences
> >> of removing this possibility.
> >>
> >> Comments welcomed!
> >> Gavin

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From guy.steele at oracle.com Mon Apr 30 17:31:22 2018
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 30 Apr 2018 13:31:22 -0400
Subject: JEP325: Switch expressions spec
In-Reply-To: <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com>
Message-ID: <13FE7F2A-CF9E-4399-B197-33A5F594DF6D@oracle.com>

> On Apr 27, 2018, at 11:03 AM, Gavin Bierman wrote:
>
> I have uploaded the latest draft of the spec for JEP 325 at http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>
> Changes from the last version:
> * Supports new -> label form in both switch expressions and switch statements
> * Added typing rules for switch expression
> * Restrict the type of a selector expression to not include long, double and float as previously proposed
> * Misc smaller changes from community feedback (thanks!)
>
> Comments welcomed!

Note the (certainly currently intentional) lack of structural parallelism in these two parts of the BNF in Section 4.11:

__________________________________

SwitchStatementClause:
    ClauseLabel Statement

ClauseLabel:
    CasePattern ->

__________________________________

SwitchBlockStatementGroup:
    GroupLabels BlockStatements

GroupLabels:
    GroupLabel {GroupLabel}

GroupLabel:
    CasePattern :

__________________________________

They can easily be made structurally parallel by changing the first two cited rules to:

__________________________________

SwitchStatementClause:
    ClauseLabels Statement

ClauseLabels:
    ClauseLabel {ClauseLabel}

ClauseLabel:
    CasePattern ->

__________________________________

which of course results in precisely my earlier proposal to allow multiple clause labels on a single statement; this note is merely to point out that it is an easy and unsurprising change to the BNF.

Then in Section 15.29 one need only change

__________________________________

SwitchExpressionClause:
    ClauseLabel Expression ;
    ClauseLabel Block
    ClauseLabel ThrowStatement

__________________________________

to

__________________________________

SwitchExpressionClause:
    ClauseLabels Expression ;
    ClauseLabels Block
    ClauseLabels ThrowStatement

__________________________________

"It's not fallthrough; it's multiple labels."

--Guy

From emcmanus at google.com Mon Apr 30 16:13:54 2018
From: emcmanus at google.com (Éamonn McManus)
Date: Mon, 30 Apr 2018 16:13:54 +0000
Subject: JEP325: Switch expressions spec
In-Reply-To: <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com>
References: <8E28CEE7-0F85-485A-9AE7-15801522B06C@oracle.com> <0CF1227E-79B4-48A1-A227-86466EDB6219@oracle.com>
Message-ID:

I believe the grammar is ambiguous regarding `->`. If you have

    case a -> b -> c

then in principle it could mean (1) when the selector expression equals `a` the value is `b -> c`, or (2) when the selector expression equals `a -> b` the value is `c`. Of course (2) is excluded semantically but I think it could be excluded syntactically just by changing the definition of ConstantExpression from

    ConstantExpression:
        Expression

to

    ConstantExpression:
        AssignmentExpression

On Fri, 27 Apr 2018 at 08:14, Gavin Bierman wrote:

> I have uploaded the latest draft of the spec for JEP 325 at http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>
> Changes from the last version:
> * Supports new -> label form in both switch expressions and switch statements
> * Added typing rules for switch expression
> * Restrict the type of a selector expression to not include long, double and float as previously proposed
> * Misc smaller changes from community feedback (thanks!)
>
> Comments welcomed!
> Gavin
>
> > On 12 Apr 2018, at 22:27, Gavin Bierman wrote:
> >
> > I have uploaded a draft spec for JEP 325: Switch expressions at http://cr.openjdk.java.net/~gbierman/switch-expressions.html
> >
> > Note there are still three things missing:
> >
> > * There is no text about typing a switch expression, as this is still being discussed on this list.
> > * There is no name given for the exception raised at runtime when a switch expression fails to find a matching pattern label, as this is still being discussed on this list.
> > * The spec currently permits fall through from a "case pattern:" statement group into a "case pattern ->" clause. We are still working through the consequences of removing this possibility.
> >
> > Comments welcomed!
> > Gavin