From brian.goetz at oracle.com Wed Jan 2 18:21:39 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 2 Jan 2019 13:21:39 -0500
Subject: Raw string literals -- restarting the discussion
Message-ID: <8892F1AE-D816-4B97-AD8A-548CFA6B4744@oracle.com>

As many of you saw, we pulled back the Raw String Literals feature from JDK 12. The public statement is here:

    http://mail.openjdk.java.net/pipermail/jdk-dev/2018-December/002402.html

So, let's restart the design discussion.

First, I want to enumerate some of the process errors I think we made.

 - We never really explored the full design space. The initial proposal had a reasonable syntactic strawman, and rather than explore the entire space, we mostly followed the path of refining the initial strawman, and stopped there.

 - We got caught in the "linear thinking" trap with respect to the design center. We started off thinking of this feature as "raw strings", of which multi-line strings are an important sub-case, but in reality most of the user pain is over dealing with multi-line snippets of HTML, JSON, XML, or SQL, and raw-ness is secondary. We never really made this turn.

 - We were too focused on getting the last 2% rather than the first 98%. (Note that for many, perhaps most language features, the last 2% is critical; for this one, which is entirely about syntactic convenience, it is not.) Specifically, by focusing on self-embedding as a test of fitness rather than more typical use cases, we ended up in a place that was both more complex than necessary, and at the same time, still had prominent anomalies. (Anomalies are unavoidable if we are unwilling to take on a super-ugly syntax, but we do have some control over how obvious and prominent they are.)

From my "language steward" perspective, my main problem is that the two forms of string literals in the current proposal are gratuitously unrelated.
They are syntactically unrelated (different delimiters and delimiter arity rules), and semantically unrelated (one must be raw and permits multiple lines; the other cannot be raw and cannot span multiple lines.) I would prefer to have a single string literal feature, with some sub-options for controlling raw-ness and/or line spanning -- with bonus points if these are orthogonal aspects. (As a sub-concern, I would strongly prefer we not burn the backtick character as a delimiter; it should be entirely possible to avoid this by building on the existing string literal mechanism.)

So, how should we evaluate success here? This feature doesn't improve the expressiveness or abstractive ability of the language at all -- it's purely about syntactic convenience. And, given that we've limped along for 20+ years without it, its lack can't be all _that_ problematic. So let's identify the use cases we care about most, and evaluate the feature through the lens of how it helps those use cases. In my opinion, these are:

 - Multi-line snippets of JSON, HTML, XML, and SQL embedded in Java code as string literals. (Other languages are used too, but these constitute the majority.) These currently require escaping for quotes and for newlines, which means every such snippet requires substantial surgery. This is painful for code writers (though IDEs can do most of the lifting here), but more importantly, the result is harder to read, and it is really easy to leave out a `\n` and get the wrong result, and not have it be immediately noticeable. We would like for most such snippets to be simply pastable without modification.

 - Regular expressions and Windows paths routinely require escaping, which again is easy to get wrong and hard to read. (Regular expressions are hard enough to read; we don't need to make it harder.) These are typically a single line.
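
To make these two pain points concrete, here is a minimal sketch in Java as it stands today, with no new literal syntax. The class name `EscapingPain`, the field names, and the sample data are illustrative assumptions, not part of any proposal.

```java
import java.util.regex.Pattern;

public class EscapingPain {
    // A small JSON snippet: every embedded quote needs a backslash, and
    // every line break needs an explicit \n plus string concatenation.
    static final String JSON =
        "{\n" +
        "  \"name\": \"Jean Smith\",\n" +
        "  \"age\": 32\n" +
        "}";

    // The regex \w+\s\d{2} needs each backslash doubled inside a literal.
    static final Pattern WORD_AND_NUMBER = Pattern.compile("\\w+\\s\\d{2}");

    // The Windows path C:\bin\putty needs each backslash doubled as well.
    static final String PATH = "C:\\bin\\putty";

    public static void main(String[] args) {
        System.out.println(JSON);
        System.out.println(WORD_AND_NUMBER.matcher("age 32").matches());
        System.out.println(PATH);
    }
}
```

Dropping any one of the `\n` sequences in `JSON` still compiles and silently yields a different string, which is the "not immediately noticeable" failure mode described above.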

Given that this feature is pure convenience, we'd also like to avoid excessive spending of our complexity budgets -- either language complexity or teachability. Grabbing for that last 2% at the expense of either of these is not a good trade.

Note too that there is no ideal answer here; we can see this quite clearly by looking at the variety of choices other languages have made, and each still has anomalies (e.g., Python raw strings can't end with a backslash) or forces ugly complexity on the reader (e.g., user-selected nonces in C++ raw strings, or Rust's `#` characters). This is truly a "pick your poison" game.

Let's remind ourselves of what other languages do in this area. In all these languages, raw strings can contain newlines; some have separate features for multi-line escaped strings and multi-line raw strings.

 - C simulates multi-line strings by having a continuation character (backslash) in the last column, or by implicitly concatenating adjacent string literals (`"raw" "string"`). It does not support raw strings, though there is a gcc extension that emulates C++ raw strings.

 - C++ supports multi-line strings through raw strings. It denotes raw strings with an `R` prefix before the quotes, and a user-selected nonce and parentheses inside the quotes: `R"NONCE(raw string)NONCE"`. The nonce may be empty, but the parens are required.

 - Rust supports multi-line strings by simply allowing newline characters in an ordinary string literal. It separately supports raw string literals with an `r` prefix, followed by a variable (can be zero) number of `#` characters, a double quote, the raw string, a double quote, and the same number of `#` characters: `r##"raw string"##`.

 - Python allows string literals to span multiple lines by using a three-quote (`"""`) delimiter. It allows raw string literals by prefixing the string literal with `r`.
   Its escaping rules for quotes in raw strings are unusual; a backslash preceding a quote escapes the quote, but leaves the backslash in the string. (Accordingly, a raw string cannot end with a backslash.)

 - Ruby supports multi-line strings with here-docs, and raw strings using the `%q()` construct: `%q(raw string)`.

 - C#, like C++, supports multi-line strings through raw strings. A raw string precedes the string literal with an `@` character: `@"raw string"`.

 - Scala and Kotlin, like C++ and C#, support multi-line strings through raw strings. A raw string is delimited with triple quotes: `"""raw string"""`.

Note too that there is also room for interpretation on the meaning of "raw"; Python permits some escaping in raw strings, and Kotlin permits interpolation in raw strings.

We can divide the approaches roughly into three categories:

 - Those that use user-supplied nonces (C++, here-docs). These can render 100% of embedded strings, with the costs that come with nonces: annoying to write, and imposing cognitive load to read (as nearly any sequence can be a nonce.)

 - Those that use variable-sized delimiters (Rust, and our previous proposal). These are simpler, but will invariably have some anomalies.

 - Those that use fixed delimiters (C#, Scala). These are simpler still, and will have more anomalies.

So, recapping our starting point and guidance:

 - The primary use case is multi-line snippets of JSON, HTML, XML, and SQL. It is rare that these require true raw-ness, but they all commonly have embedded quote characters.

 - The secondary use case is truly raw strings, of which the most common offenders are small-ish -- regular expressions and Windows paths.

 - We should start by trying to extend existing string literals to support raw and/or multi-line strings.

Some questions we need to answer:

 - What are reasonable delimiter choices for raw and/or multi-line strings?
 - Should the default treatment of multi-line strings be raw or escaped (alternately, is this one feature or two)?

 - Is raw-ness a property of a string literal, or a state that can change within the literal (i.e., with embedded start-raw/end-raw escape sequences)?

 - How do we embed delimiters in raw strings (escaping, doubling up, concatenation)?

 - How far do we want to go to support embedding of delimiters?

Let's start by asking how we might extend the current string literal feature to support multi-line strings. Currently, a string literal starts with a double-quote, can span only a single line of source, and ends at the first unescaped double quote. How could we extend this to a multi-line string literal? Some possibilities include:

 - Simply remove the constraint of "can only span a single line"; no other change to delimiters is required (the Rust approach.)

 - Choose a different fixed delimiter, such as tripled quotes ("""), doubled single-quotes (''...''), or a multi-character quote token (`/"..."/`).

 - Use a modifier on the opening quote, such as `R"..."` or `@"..."`.

 - Use an embedded escape sequence, such as `"\M..."`, to opt into multi-line treatment.

 - Use here-docs, with a fixed or user-providable nonce.

I think it's reasonable to eliminate here-docs from consideration, as these are more typically associated with scripting languages.

At first blush, the simplicity of the Rust approach is attractive; just let strings span multiple lines, with no new syntax. The obvious counter-arguments are pretty weak in the current age; if you code in an IDE, as most developers do, it is not easy to accidentally leave off a closing quote, and the syntax highlighting will make this obvious in the event we do so anyway. But, if we look through the lens of our use cases -- such as JSON snippets -- we see that this approach fails almost completely, because you _still_ have to escape the quotes, and almost all multi-line snippets will have quotes. So, let's cross this off too.
The same applies to using a letter prefix for multi-line strings; it doesn't address the primary use case.

Note too that our primary use case admits a middle-ground option: multi-line strings are not raw, but quotes need not be escaped. This is a possibility if the delimiter is anything other than a single double-quote (`"`).

So, some reasonable starting points on this front include:

 - Just follow C#/Scala/Kotlin, where there's a single mechanism for both raw and multi-line, delimited by triple-quotes. Here, a single (or double) embedded quote does not necessarily need to be escaped.

 - Use triple-quotes for non-raw multi-line string literals, and some sort of additional way to select raw-ness for either single- or triple-quoted string literals. (Same comment about embedded quotes.)

 - Same, but use doubled or tripled single-quotes.

Within the "multiple quote" options, we can separately choose between a fixed number of quotes (e.g., 3) or a variable number (e.g., 3 or more, odd only, etc.) The trade-off here is about where the anomalies go; with the variable-number approaches, it gets harder to start or end with the delimiter character (this is not necessarily a serious anomaly, but it is a prominent one), and with the fixed approach, there is more need to do something (escaping, concatenating, etc.) about the delimiter character (though embedding triple-quotes is not all that common in our primary use cases). Also, our IDE friends have pointed out that even numbers of quotes put the IDE in a quandary as to whether the user has just typed the opening delimiter, or both the opening and closing delimiters.

Now, raw-ness. One option is to just say that multi-line strings are also raw.
We have evidence that this is not totally unworkable, as several languages have gone this way, but it does mean that for the use cases where the user wants multi-line but not raw, they must resort either to concatenation, or explicit escape processing (e.g., `"""foo""".escape()`).

Another is to allow a prefix character to indicate raw-ness: `R"foo"` or `R"""foo"""`. The prefix character approach is more extensible to other modes of string processing.

Another option is to use a different delimiter, as the current proposal does. If we were to go this way, I'd suggest we consider double or triple single-quote (which are currently illegal), rather than continuing with backtick.

A fourth option, one that has not yet been considered, is to say that raw-ness is a _state_ of processing a string literal; string literals start out escaped, but can drop into (and out of) raw-ness as they like:

    String s = "This part is escaped\n, but this part\- is raw, and this part\+ is escaped again.";
    String path = "\-C:\bin\putty";

This gets us to where multi-line-ness and raw-ness are orthogonal properties of string literals -- without requiring any new delimiters.

So, how to proceed? First, let's try to avoid focusing on our own personal preferences, or being distracted by unfamiliarity, and remember that our job here is to get to a design that's best for _tomorrow's_ Java developers and source base. (That means that, for example, we can't allow ourselves to be distracted by the fact that, say, embedded "\-" or `R"..."` is unfamiliar today. It will be familiar tomorrow, if we decide that's what would be best.)

Here's what would be super-useful:

 - Data that supports or refutes the claim that our primary use cases are embedded JSON, HTML, XML, and SQL.

 - Use cases we've left out, for which we can discuss whether we want to incorporate them into our goals.
 - Data (either Java or non-Java) on the use of various flavors of strings (raw, multi-line, etc.) in real codebases, which might be useful to help determine, for example, whether raw and multi-line should be lumped into the same bucket or not.

The bike shed is open (but please show up with structural members, not just paint.)

From brian.goetz at oracle.com Wed Jan 2 18:53:39 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 2 Jan 2019 13:53:39 -0500
Subject: Flow scoping
In-Reply-To:
References:
Message-ID: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com>

> Before we get to the harder ones, can people share what made them uncomfortable, for the ones where you marked "not comfortable with the answer"?

We're currently exploring the consequences of Doug's query about introducing non-conditional locals into expressions (not necessarily because we want to have such a feature now, but because if we did want such a feature, we'd want the scoping machinery to handle it well, and if it doesn't, that may suggest refinements for the scoping machinery.)

Let's talk about the more complicated examples in the quiz. Everyone seemed to do pretty well on the if/else ones (flip the condition, and it flips the scopes), the conditional-expression ones, and the short-circuiting examples.

Let's talk about merging. In this example:

```
public void test() {
    if (a instanceof String v || b instanceof String v) {
        println(v.length());
    } else {
        println("foo");
    }
}
```

we have two candidate conditional variables called `v` of type `String`, and the flow rules ensure that at any point, at most one of them could be DA (definitely assigned). So in the `true` arm, the use of `v` is OK, and it binds to whichever of these is DA (both must have the same type, otherwise it's an error.) The rules sound complicated, but as always, our DA intuition guides us pretty well -- we can't get to the `true` arm unless exactly one of the `instanceof` expressions has matched.

In the `&&` version of this example, people's intuition was mostly right -- that this code is illegal -- but some were unsure:

```
public void test() {
    if (a instanceof String v && b instanceof String v) {
        println(v.length());
    } else {
        println("foo");
    }
}
```

The answer here is that because `v` is not DU (definitely unassigned) in the second `instanceof`, and therefore might be in scope, this constitutes an illegal shadowing, and the compiler rejects it.

The hard ones (and the potentially controversial ones) were the last two -- regarding use _after_ the declaring statement:

```
public void test() {
    if (!(a instanceof String v))
        throw new NotAStringException("a");
    if (v.length() == 0)
        throw new EmptyStringException();

    println(v.length());
}
```

```
public void test() {
    if (!(a instanceof String v))
        return;

    println(v.length());
}
```

In both of these cases, flow scoping allows `v` to be in scope in the last line -- because `v` is DA at the point of use. And this is because if the `instanceof` does not match, control does not make it to the use. In other words, we can use the full flow analysis -- including reachability -- to conclude that the only way we could reach the use is if the binding is defined. (If there were not a return or a throw, then the last use would not be in scope.)

For some, this is uncomfortable at first, as not only does the scope bleed into the full `if` statement, but it bleeds out of it too. But again, we're driven by two goals:

 - Make scoping line up with DA-ness (so we can build on an existing mechanism rather than creating a new one)
 - Be friendly to refactoring.

On the latter point, I'd like for

    if (E) { throw; } else { s }

to be refactorable to

    if (E) throw;
    s;

as it is common that users will perform precondition checking on entry to a method, throw if there's a failure, and then "fall"
into the main body of the method:

    if (!(a instanceof String s))
        throw new IllegalArgumentException(a.toString());
    // use s here

The main argument against supporting this is discomfort; it feels new and different. (But, because we're building on DA, it's not, really.)

From james.laskey at oracle.com Wed Jan 2 19:00:05 2019
From: james.laskey at oracle.com (Jim Laskey)
Date: Wed, 2 Jan 2019 15:00:05 -0400
Subject: Enhancing Java String Literals Round 2
Message-ID: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com>

>
> http://cr.openjdk.java.net/~jlaskey/Strings/RTL2/index.html
> http://cr.openjdk.java.net/~jlaskey/Strings/RTL2.pdf
>
> First of all, I would like to apologize for leading us down the garden path re Java Raw String Literals. I jumped into this feature fully enamoured with the JavaScript equivalent and, "why can't we have this in Java?" As the proposal evolved, it became clear that what we came up with was not a good Java solution. I underestimated the concern that the original proposal was too left-field and did not fit into Java very well. It's somewhat ironic that the backtick looks like a thorn.
>
> So, let's start the new year with a structured approach to the enhanced string literal design. Brian gave a summary of why the old design fails. Starting with this summary, Brian and I talked out a series of critical decision points that should be given thought, if not answers, before we propose a new design. As an exercise, I supplemented these points and created a series of small decision trees (a full-on decision tree would be complex and not very helpful.) I found these trees good intuition pumps for getting the design at least 80% there. Hopefully, this exercise will help you in the same way.
>
>
> Even the label Raw String Literal put the emphasis on the wrong part of the feature. What developers really want is multi-line strings. They want to be able to paste alien source into their Java programs with as little fuss as possible.
>
> String raw-ness (not translating escapes) is a tangential aspect that may or may not be needed to implement multi-line strings. Yes, the regex and Windows file path arguments in JEP 326 are still valid, but this aspect needs to be separated from the main part of the design. Further in the discussion, we'll see that raw-ness is really a many-headed hydra, best slain one head at a time.
>
>
> We have to be honest. We know Java's primary market. Sure, we want to embed Java in Java for writing tests. Sure, there is JavaScript and CSS in web pages. Nevertheless, most uses of multi-line will be for non-complex grammars. Specifically, grammars that don't require special handling of multi-character delimiter sequences. If you can accept this, then the solution set is much smaller.
>
>
> This is an easy one. Familiarity is key to feature education. Radical wandering off with new syntax is not helpful to anyone but bloggers and authors.
>
>
> If you buy into the familiarity argument, then double quote is really the only choice for a delimiter. Double quote already indicates a string literal. Single quote indicates a character. We don't want to gratuitously burn unused symbols like backtick. Backslash works for regex but maybe not for others. Combinations and nonces just introduce new noise when our original goal was to reduce noise and complexity.
>
>
> Other languages avoid delimiter escape sequences by doubling up. Example: "abc""def" -> abc"def. This concept is unfamiliar to Java developers, so why change now? Escape sequences are what we know.
>
>
> Language designers got very nervous when I suggested infinite delimiter sequences in the original proposal; lexically sacrilegious. I felt strongly that it was easy to explain and only 1 in 1M developers would ever use more than 4-5 character delimiter sequences. In round two, I have come to agree. This was taking on more complexity than is really warranted, for a use case that doesn't come along very often.
> I suggest we only need single and triple double quotes. A single double quote works today, so no argument there. Double double quotes means empty string, no problem. Triple double quotes are only necessary to avoid having to escape quotes in alien source.
>
>     String json = """
>         {
>             "name": "Jean Smith",
>             "age": 32,
>             "location": "San Jose"
>         }
>         """;
>
> versus
>
>     String json = "
>         {
>             \"name\": \"Jean Smith\",
>             \"age\": 32,
>             \"location\": \"San Jose\"
>         }
>         ";
>
> This second case is where we wandered off the tracks with raw-ness. We assumed raw-ness is necessary to avoid all the backslashes. Most cases can be handled with triple double quotes.
>
> Okay, so why not more combinations? Simply because, most of the time, they are not needed. On the rare occasion we do have nested triple double quotes, we can then use escape sequences.
>
>     String nestedJSON = """
>         \"\"\"
>         {
>             "name": "Jean Smith",
>             "age": 32,
>             "location": "San Jose"
>         }
>         \"\"\";
>         """;
>
> or better yet, you only have to escape every third double quote
>
>     String nestedJSON = """
>         \"""
>         {
>             "name": "Jean Smith",
>             "age": 32,
>             "location": "San Jose"
>         }
>         \""";
>         """;
>
> Not so evil and it's familiar.
>
>
> Meaning, you can only use single quotes for simple strings and triple quotes for multi-line strings. I don't have a strong opinion other than it seems like an unneeded restriction. The only argument I've heard has been for better error recovery when missing a close delimiter during parsing. My counter to that argument is that if you are processing multi-line strings then you can easily track the first newline after the opening delimiter and recover from there. I implemented that recovery in javac and it worked out well.
>
>
> Cooked (translated escape sequences) should be the default. Why should a multi-line string be different than a simple string? We have a solution for embedding double quote. Single quotes don't require escaping. Tabs and newlines can exist as is.
> Unicode characters can be either an escape sequence or the unicode character. So the only problem case is backslash. I would argue that the rare backslash can be escaped. If not, then the developer can use the raw-ness solution.
>
>
> If we don't translate newlines, then source is not transferable across platforms. That is, a source from one platform may not execute the same way on another platform. Translating consistently guarantees execution consistency. As a note, programming languages that didn't translate newlines in multi-line string literals typically regretted it later (Python.)
>
>
> With the original Raw String Literal proposal, there was concern about leading and trailing nested delimiters. If we default to cooked strings, then we can use \".
>
>
> These questions have been answered numerous times and fall into the realm of library support. Same arguments as before, same outcome.
>
>
> To summarize the bold paths at this point:
> - multi-line strings are an extension of traditional simple strings
> - newlines in a string are no longer an error and the string can extend across several lines
> - error recovery can pick up at the first newline after the opening delimiter
> - multi-line strings process escape sequences (including unicode) in the same way as simple strings
> - multiple double quotes are handled with escape sequences
> - triple double quote delimiter is introduced to avoid escaping simple double quote sequences
>
> Generally, I think this is very much in the traditional Java spirit.
>
>
> Now, let's move on to the lesser but more interesting issue. As I stated above, raw-ness is a multi-headed beast. Raw-ness involves turning off the translation of
> - escape sequences
> - unicode escapes
> - delimiter sequences
> - escape sequence prefix (backslash)
> - tabs and newlines (control characters in general)
>
> Sometimes we need all of the translations, sometimes few and sometimes none.
In the multi-line discussion above, we see we don't need raw as much as we might have expected. Maybe for occasional backslashes, as in regex and Windows paths strings. > > > > > > The original Raw String Literal proposal suggested that raw-ness was a property of the whole string literal and thus we proposed an alternate delimiter syntax just to emphasize that fact. If we accept the bold path of multi-line discussion above, then alternate delimiter is out. This leaves prefixing as the best option to bless a string literal with raw-ness. > > At this point, I would like to suggest an alternate, maybe progressive way to think of raw-ness. Since the original proposal, I have been thinking of raw-ness as a state of processing the literal. State is certainly obvious in the scanner implementation, why not raise that to the language level? If it is a state then we should be able to enter and leave that state in some way. Escape sequences are an obvious way of transitioning translation in the string. \- and \+ are available and not currently recognized as valid escape sequences, why not \- and \+ to toggle escape processing? > > String a = "cooked \-raw\+ cooked"; // cooked raw cooked - a little odd but not so much so > String b = "abc\-\\\\\+def"; // abc\\\\def - struggling > String c = "\-abc\\\\def"; // abc\\\\def - more readable as an inner prefix > String d = "abc\-\-def\+\+ghi"; // abc\-def\+ghi - raw on "\-" is "\" and "-", raw off "\+" is "\" and "+" > String e = """\-"abc"\+"""; // "abc" - \- and \+ act a no-ops of sorts > > Comparing property vs state: > > Runtime.getRuntime().exec(R""" "C:\Program Files\foo" bar""".strip()); > Runtime.getRuntime().exec("""\-"C:\Program Files\foo" bar"""); > > System.out.println("this".matches(R"\w\w\w\w")); > System.out.println("this".matches("\-\w\w\w\w")); > > String html = R""" > > >

Hello World.

> > > """.align(); > String html = """\- > > >

Hello World.

> > > """.align(); > > > String nested = """ > String EXAMPLE_TEST = "This is my small example " > + "string which I'm going to " > + "use for pattern matching."; > """ + > R""" > System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t")); > """; > String nested = """ > String EXAMPLE_TEST = "This is my small example " > + "string which I'm going to " > + "use for pattern matching."; > \- > System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t")); > \+ > """; > > Hopefully, this is a good starting point for discussion. As before, I'm pragmatic about which direction we go, so feel free to comment. > > Cheers, > > -- Jim > > > > > > > > > > From james.laskey at oracle.com Wed Jan 2 19:04:51 2019 From: james.laskey at oracle.com (Jim Laskey) Date: Wed, 2 Jan 2019 15:04:51 -0400 Subject: Enhancing Java String Literals Round 2 In-Reply-To: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> References: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> Message-ID: <24CA86D8-B59A-4D27-B303-B29134E5CD71@oracle.com> Diagrams were stripped, follow one of the two links to view in full. Cheers, ? Jim > On Jan 2, 2019, at 3:00 PM, Jim Laskey wrote: > > >> >> http://cr.openjdk.java.net/~jlaskey/Strings/RTL2/index.html >> http://cr.openjdk.java.net/~jlaskey/Strings/RTL2.pdf >> First of all, I would like to apologize for leading us down the garden path re Java Raw String Literals. I jumped into this feature fully enamoured with the JavaScript equivalent and, "why can't we have this in Java?" As the proposal evolved, it became clear that what we came up with was not a good Java solution. I underestimated the concern that the original proposal was too left field and did not fit into Java very well. It's somewhat ironic that the backtick looks like a thorn. >> >> So, let's start the new year with a structured approach to the enhance string literal design. Brian gave a summary of why the old design fails. 
Starting with this summary, Brian and I talked out a series of critical decision points that should be given thought, if not answers, before we propose a new design. As an exercise, I supplemented these points and created a series of small decision trees (a full on decision tree would be complex and not very helpful.) I found these trees good intuition pumps for getting the design at least 80% there. Hopefully, this exercise will help you in the same way. >> >> >> >> >> Even the label Raw String Literal put the emphasis on the wrong part of the feature. What developers really want is multi-line strings. They want to be able to paste alien source into their Java programs with as little fuss as possible. >> >> String raw-ness (not translating escapes) is a tangential aspect, that may or may not be needed to implement multi-line strings. Yes, the regex and Window's file path arguments in JEP 326 are still valid, but this aspect needs to be separated from the main part of the design. Further in the discussion, we'll see that raw-ness is really a many-headed hydra, best slain one head at a time. >> >> >> >> >> We have to be honest. We know Java's primary market. Sure we want to embed Java in Java for writing tests. Sure there is JavaScript and CSS in web pages. Nevertheless, most uses of multi-line will be for non-complex grammars. Specifically, grammars that don't require special handling of multi-character delimiter sequences. If you can accept this, then the solution set is much smaller. >> >> >> >> >> This is an easy one. Familiarity is key to feature education. Radical wandering off with new syntax is not helpful to anyone but bloggers and authors. >> >> >> >> >> If you buy into the familiarity argument, then double quote is really only choice for a delimiter. Double quote already indicates a string literal. Single quote indicates a character. We don?t want to gratuitously burn unused symbols like backtick. Backslash works for regex but maybe not for others. 
Combinations and nonces just introduce new noise when our original goal was to reduce noise and complexity. >> >> >> >> >> Other languages avoid delimiter escape sequences by doubling up. Example, "abc""def" -> abc"def. This concept is unfamiliar to Java developers, why change now. Escape sequences are what we know. >> >> >> >> >> Language designers got very nervous when I suggested infinite delimiter sequences in the original proposal; lexically sacrilegious. I felt strongly that it was easy to explain and only 1 in 1M developers would ever use more than 4-5 character delimiter sequences. In round two, I have come to agree. This was taking on more complexity than is really warranted, for a use case that doesn?t come along very often. I suggest we only need single and triple double quotes. A single double quote works today, so no argument there. Double double quotes means empty string, no problem. Triple double quotes are only necessary to avoid having to escape quotes in alien source. >> >> String json = """ >> { >> "name": "Jean Smith", >> "age": 32, >> "location": "San Jose" >> } >> """; >> >> versus >> >> String json = " >> { >> \"name\": \"Jean Smith\", >> \"age\": 32, >> \"location\": \"San Jose\" >> } >> "; >> >> This second case is where we wandered off the tracks with raw-ness. We assumed raw-ness is necessary to avoid all the backslashes. Most cases can be handled with triple double quotes. >> >> Okay, so why not more combinations? Simply because, most of the time they are not needed. On the rare occasion we do have nested triple double quotes, we can then use escape sequences. >> >> String nestedJSON = """ >> \"\"\" >> { >> "name": "Jean Smith", >> "age": 32, >> "location": "San Jose" >> } >> \"\"\"; >> """; >> >> or better yet, you only have to escape every third double quote >> >> String nestedJSON = """ >> \""" >> { >> "name": "Jean Smith", >> "age": 32, >> "location": "San Jose" >> } >> \"""; >> """; >> >> Not so evil and it's familiar. 
>> >> >> >> >> Meaning, you can only use single quotes for simple strings and triple quotes for multi-line strings. I don't have a strong opinion other than it seems like an unneeded restriction. The only argument I've heard has been for better error recovery when missing a close delimiter during parsing. My counter for that argument is that if you are processing multi-line strings then you can easily track the first newline after the opening delimiter and recover from there. I implemented that recovery in javac and it worked out well. >> >> >> >> >> >> Cooked (translated escape sequences) should be the default. Why should a multi-line string be different than a simple string? We have a solution for embedding double quote. Single quotes don't require escaping. Tabs and newlines can exist as is. Unicode characters can be either an escape sequence or the unicode character. So the only problem case is backslash. I would argue that the rare backslash can be escaped. If not, then the developer can use the raw-ness solution. >> >> >> >> >> If we don't translate newlines, then source is not transferable across platforms. That is, a source from one platform may not execute the same way on another platform. Translating consistently guarantees execution consistency. As a note, programming languages that didn't translate newlines in multi-line string literals typically regretted it later (Python). >> >> >> >> >> With the original Raw String Literal proposal, there was concern about leading and trailing nested delimiters. If we default to cooked strings, then we can use \". >> >> >> >> >> These questions have been answered numerous times and fall into the realm of library support. Same arguments as before, same outcome.
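The point about translating newlines can be illustrated with a small sketch. It is not the compiler's implementation; `normalizeLineTerminators` is a hypothetical helper that only mimics the effect of mapping every line terminator in a literal to a single LF, so that the same source yields the same string whether the file was saved with Unix, Windows, or old Mac line endings.

```java
// Hypothetical helper, for illustration only: models "translate newlines
// consistently" by mapping CR LF and lone CR to LF.
class NewlineNormalization {

    static String normalizeLineTerminators(String s) {
        // Replace the two-character CR LF first so it does not become two LFs.
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String savedOnUnix = "SELECT *\nFROM t\n";
        String savedOnWindows = "SELECT *\r\nFROM t\r\n";

        // Without translation the two literals would differ at runtime.
        System.out.println(savedOnUnix.equals(savedOnWindows));
        // With translation they denote the same string.
        System.out.println(savedOnUnix.equals(normalizeLineTerminators(savedOnWindows)));
    }
}
```

Doing this once, at the language level, is what keeps behaviour independent of the platform the source file happened to be edited on.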
>> >> >> To summarize the bold paths at this point: >> - multi-line strings are an extension of traditional simple strings >> - newlines in a string are no longer an error and the string can extend across several lines >> - error recovery can pick up at the first newline after the opening delimiter >> - multi-line strings process escape sequences (including unicode) in the same way as simple strings >> - multiple double quotes are handled with escape sequences >> - triple double quote delimiter is introduced to avoid escaping simple double quote sequences >> >> Generally, I think this is very much in the traditional Java spirit. >> >> >> Now, let's move on to the lesser but more interesting issue. As I stated above, raw-ness is a multi-headed beast. Raw-ness involves turning off the translation of >> - escape sequences >> - unicode escapes >> - delimiter sequences >> - escape sequence prefix (backslash) >> - tabs and newlines (control characters in general) >> >> Sometimes we need all of the translations, sometimes few and sometimes none. In the multi-line discussion above, we see we don't need raw as much as we might have expected. Maybe for occasional backslashes, as in regex and Windows path strings. >> >> >> >> >> >> The original Raw String Literal proposal suggested that raw-ness was a property of the whole string literal and thus we proposed an alternate delimiter syntax just to emphasize that fact. If we accept the bold path of the multi-line discussion above, then an alternate delimiter is out. This leaves prefixing as the best option to bless a string literal with raw-ness. >> >> At this point, I would like to suggest an alternate, maybe progressive way to think of raw-ness. Since the original proposal, I have been thinking of raw-ness as a state of processing the literal. State is certainly obvious in the scanner implementation, why not raise that to the language level? If it is a state then we should be able to enter and leave that state in some way.
Escape sequences are an obvious way of transitioning translation in the string. \- and \+ are available and not currently recognized as valid escape sequences, so why not use \- and \+ to toggle escape processing? >> >> String a = "cooked \-raw\+ cooked"; // cooked raw cooked - a little odd but not so much so >> String b = "abc\-\\\\\+def"; // abc\\\\def - struggling >> String c = "\-abc\\\\def"; // abc\\\\def - more readable as an inner prefix >> String d = "abc\-\-def\+\+ghi"; // abc\-def\+ghi - raw on "\-" is "\" and "-", raw off "\+" is "\" and "+" >> String e = """\-"abc"\+"""; // "abc" - \- and \+ act as no-ops of sorts >> >> Comparing property vs state: >> >> Runtime.getRuntime().exec(R""" "C:\Program Files\foo" bar""".strip()); >> Runtime.getRuntime().exec("""\-"C:\Program Files\foo" bar"""); >> >> System.out.println("this".matches(R"\w\w\w\w")); >> System.out.println("this".matches("\-\w\w\w\w")); >> >> String html = R""" >> >> >>

Hello World.

>> >> >> """.align(); >> String html = """\- >> >> >>

Hello World.

>> >> >> """.align(); >> >> >> String nested = """ >> String EXAMPLE_TEST = "This is my small example " >> + "string which I'm going to " >> + "use for pattern matching."; >> """ + >> R""" >> System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t")); >> """; >> String nested = """ >> String EXAMPLE_TEST = "This is my small example " >> + "string which I'm going to " >> + "use for pattern matching."; >> \- >> System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t")); >> \+ >> """; >> >> Hopefully, this is a good starting point for discussion. As before, I'm pragmatic about which direction we go, so feel free to comment. >> >> Cheers, >> >> -- Jim From dl at cs.oswego.edu Fri Jan 4 12:39:30 2019 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 4 Jan 2019 07:39:30 -0500 Subject: Flow scoping In-Reply-To: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> Message-ID: On 1/2/19 1:53 PM, Brian Goetz wrote: > The hard ones (and the potentially controversial ones) were the last two -- regarding use _after_ the declaring statement: > > ``` > public void test() { > if (!(a instanceof String v)) > throw new NotAStringException("a"); > if (v.length() == 0) > throw new EmptyStringException(); > > println(v.length()); > } > ``` > > ``` > public void test() { > if (!(a instanceof String v)) > return; > > println(v.length()); > } > ``` > > In both of these cases, flow scoping allows `v` to be in scope in the last line -- because `v` is DA at the point of use. And this is because if the `instanceof` does not match, control does not make it to the use. In other words, we can use the full flow analysis -- including reachability -- to conclude the only way we could reach the use is if the binding is defined. (If there was not a return or a throw, then the last use would not be in scope.)
> > For some, this is uncomfortable at first, as not only does the scope bleed into the full `if` statement, but it bleeds out of it too. This seems readily explainable in the same way that other "fall-throughs" work: There is an implicit else spanning the rest of the block. Which is just a syntactic convenience to decrease brace-levels. So it doesn't seem very worrisome on these grounds. Although I wonder whether style guides requiring explicit braces will require them here. -Doug From brian.goetz at oracle.com Fri Jan 4 13:47:23 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 4 Jan 2019 08:47:23 -0500 Subject: Flow scoping In-Reply-To: References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> Message-ID: <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> >> For some, this is uncomfortable at first, as not only does the scope bleed into the full `if` statement, but it bleeds out of it too. > > This seems readily explainable in the same way that other > "fall-throughs" work: There is an implicit else spanning the rest of the > block. Which is just a syntactic convenience to decrease brace-levels. > So it doesn't seem very worrisome on these grounds. This is certainly the intuition that guided us here; it should be possible to freely refactor if (e) throw x; else { stuff } to if (e) throw x; stuff; and it would be sad if we could not. > Although I wonder > whether style guides requiring explicit braces will require them here. Some style guides surely will, and that's fine. From amaembo at gmail.com Fri Jan 4 14:07:44 2019 From: amaembo at gmail.com (Tagir Valeev) Date: Fri, 4 Jan 2019 21:07:44 +0700 Subject: Flow scoping In-Reply-To: <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> Message-ID: Hello!
> This is certainly the intuition that guided us here; it should be > possible to freely refactor > > if (e) > throw x; > else { stuff } > > to > > if (e) throw x; > stuff; > > and it would be sad if we could not. For the record: I heavily support this. If the then-branch cannot complete normally, then unwrapping the else-branch should preserve the program semantics. It works today, and it should work in future Java as well. With best regards, Tagir Valeev. From forax at univ-mlv.fr Fri Jan 4 14:17:35 2019 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 4 Jan 2019 15:17:35 +0100 (CET) Subject: Flow scoping In-Reply-To: References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> Message-ID: <1510917122.325690.1546611455174.JavaMail.zimbra@u-pem.fr> > From: "Tagir Valeev" > To: "Brian Goetz" > Cc: "amber-spec-experts" > Sent: Friday, January 4, 2019 15:07:44 > Subject: Re: Flow scoping > Hello! >> This is certainly the intuition that guided us here; it should be possible to >> freely refactor >> if (e) >> throw x; >> else { stuff } >> to >> if (e) throw x; >> stuff; >> and it would be sad if we could not. > For the record: I heavily support this. If the then-branch cannot complete normally, > then unwrapping the else-branch should preserve the program semantics. It works > today, and it should work in future Java as well. So do I. As a user, if you explicitly choose the instanceof form that introduces a local name, the local name should be introduced :) > With best regards, > Tagir Valeev. Happy new year, Rémi
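The refactoring equivalence discussed in this exchange can be sketched in Java that compiles today, without pattern matching. The class and method names below are invented for illustration; with the proposed `a instanceof String v` form, the explicit cast would disappear and `v` would simply remain in scope after the `if`, which is the flow-scoping behaviour being argued for.

```java
// Illustration only: when the then-branch cannot complete normally,
// unwrapping the else-branch leaves the behaviour unchanged.
class ElseUnwrapping {

    // Form 1: explicit else block.
    static int lengthWithElse(Object a) {
        if (!(a instanceof String)) {
            throw new IllegalArgumentException("not a string");
        } else {
            String v = (String) a;
            return v.length();
        }
    }

    // Form 2: else unwrapped; the rest of the method is the implicit else.
    static int lengthUnwrapped(Object a) {
        if (!(a instanceof String))
            throw new IllegalArgumentException("not a string");
        String v = (String) a;
        return v.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthWithElse("hello") == lengthUnwrapped("hello"));
    }
}
```

Both methods throw for non-strings and return the same length for strings, which is the property a binding variable's scope would need to respect for this refactoring to stay free.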
From brian.goetz at oracle.com Sun Jan 6 17:43:19 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 6 Jan 2019 12:43:19 -0500 Subject: Enhancing Java String Literals Round 2 In-Reply-To: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> References: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> Message-ID: <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> As Reinier pointed out on amber-dev, regex strings may routinely contain escaped meta-characters -- +, *, brackets, etc. So the embedded \- and \+ story has an obvious conflict. While these are not the only possible characters for such "shift" operators, his point that this might be overkill is a good one. So let's look at options for denoting raw-ness. - Just make triple-quote strings always raw as well as multi-line-capable; regexes and friends would use TQ strings even though they are single line (Scala, Kotlin) - Letter prefix, such as R"..." (C++, Rust) - Symbol prefix, such as @"..." (C#), or \"..." (suggestive of "distributing" the escaping across the string.) - Embedded escape sequence that switches to raw mode, but can't be switched back: "\+raw string", "\{raw}raw string". Data from Google suggests that, in their code base, on the order of 5% of candidates for multi-line strings use some escape sequences (Kevin/Liam, can you verify?) This suggests to me that the "just use TQ" approach is vaguely workable, but likely to be error-prone (5% is infrequent enough that people will say \t when they mean tab and discover this at runtime, and then have to go back and add a .escape() call.) (Of these, my current favorite is using the backslash: "cooked", """cooked and ML-capable""", \"raw", \"""raw and ML capable""". The use of \ suggests "the backslashes have been pre-added for you", building on existing associations with backslash.) Are there other credible candidates that I've missed?
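To make the conflict concrete, here is a small sketch in current Java (cooked literals only; the class and method names are made up). It shows that `\+` and `\-` are ordinary escaped characters at the regex level, so a language-level `\+` / `\-` toggle would collide with exactly the strings raw literals are meant to help with. It also shows the backslash doubling that a raw form would remove.

```java
import java.util.regex.Pattern;

// Illustration only: regex text already uses \+ and \- for its own purposes.
class RegexEscapes {

    // Regex text \w\w\w\w must be written with doubled backslashes today.
    static boolean isFourWordChars(String s) {
        return s.matches("\\w\\w\\w\\w");
    }

    // Regex text 1\+1 matches the literal string "1+1"; the \+ here is a
    // regex-level escape, not something the string literal should consume.
    static boolean isOnePlusOne(String s) {
        return Pattern.matches("1\\+1", s);
    }

    // Regex text a\-z matches the literal string "a-z".
    static boolean isLiteralRange(String s) {
        return Pattern.matches("a\\-z", s);
    }

    public static void main(String[] args) {
        System.out.println(isFourWordChars("this"));
        System.out.println(isOnePlusOne("1+1"));
        System.out.println(isLiteralRange("a-z"));
        // A Windows path needs the same doubling in a cooked literal.
        System.out.println("C:\\Program Files\\foo");
    }
}
```

Under a whole-literal raw marker such as the backslash prefix floated above, the regex text could be pasted as-is, with no in-string toggles to clash with it.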
From james.laskey at oracle.com Sun Jan 6 17:58:18 2019 From: james.laskey at oracle.com (James Laskey) Date: Sun, 6 Jan 2019 13:58:18 -0400 Subject: Enhancing Java String Literals Round 2 In-Reply-To: <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> References: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> Message-ID: <214C1B6D-2458-434A-A31F-C1A27885199C@oracle.com> The backslash prefix makes a lot of sense to me. Creating scenarios where I needed to toggle the raw-ness seemed forced. The only awkwardness I see is with leading/trailing quotes. """\"Cooked\"""" """ "Raw" """.strip() or """"Raw"""" Cooked is fine with escapes. Raw could have a rule like: any quotes after/before the opening/closing TQ sequence get added to the string. -- Jim Sent from my iPhone
From forax at univ-mlv.fr Sun Jan 6 19:39:21 2019 From: forax at univ-mlv.fr (Remi Forax) Date: Sun, 6 Jan 2019 20:39:21 +0100 (CET) Subject: Enhancing Java String Literals Round 2 In-Reply-To: <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> References: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> Message-ID: <1539687789.458143.1546803561849.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Brian Goetz" > To: "Jim Laskey" > Cc: "amber-spec-experts" > Sent: Sunday, January 6, 2019 18:43:19 > Subject: Re: Enhancing Java String Literals Round 2 > As Reinier pointed out on amber-dev, regex strings may routinely contain escaped > meta-characters -- +, *, brackets, etc. So the embedded \- and \+ story has an > obvious conflict. While these are not the only possible characters for such > "shift" operators, his point that this might be overkill is a good one. So > let's look at options for denoting raw-ness. > > - Just make triple-quote strings always raw as well as multi-line-capable; > regexes and friends would use TQ strings even though they are single line > (Scala, Kotlin) > - Letter prefix, such as R"..." (C++, Rust, Ruby) > - Symbol prefix, such as @"..." (C#), or \"..." (suggestive of "distributing"
the > escaping across the string.) > - Embedded escape sequence that switches to raw mode, but can?t be switched > back: ?\+raw string?, ?\{raw}raw string?. > > Data from Google suggests that, in their code base, on the order of 5% of > candidates for multi-line strings use some escape sequences (Kevin/Liam, can > you verify?) This suggests to me that the ?just use TQ? approach is vaguely > workable, but likely to be error-prone (5% is infrequently enough that people > will say \t when they mean tab and discover this at runtime, and then have to > go back and add a .escape() call.) > > (Of these, my current favorite is using the backslash: ?cooked?, ???cooked and > ML-capable?, \?raw?, \???raw and ML capable?. The use of \ suggests ?the > backslashes have been pre-added for you?, building on existing associations > with backslash.) > > Are there other credible candidates that I?ve missed? the triple single quote like in Ruby, '''...''' the fake method call like in Lisp or Perl, quote(...), q(...) or raw(...) R?mi > > > >> On Jan 2, 2019, at 2:00 PM, Jim Laskey wrote: >> >> >>> >>> http://cr.openjdk.java.net/~jlaskey/Strings/RTL2/index.html >>> >>> http://cr.openjdk.java.net/~jlaskey/Strings/RTL2.pdf >>> >>> First of all, I would like to apologize for leading us down the garden path re >>> Java Raw String Literals. I jumped into this feature fully enamoured with the >>> JavaScript equivalent and, "why can't we have this in Java?" As the proposal >>> evolved, it became clear that what we came up with was not a good Java >>> solution. I underestimated the concern that the original proposal was too left >>> field and did not fit into Java very well. It's somewhat ironic that the >>> backtick looks like a thorn. >>> >>> So, let's start the new year with a structured approach to the enhance string >>> literal design. Brian gave a summary of why the old design fails. 
Starting with >>> this summary, Brian and I talked out a series of critical decision points that >>> should be given thought, if not answers, before we propose a new design. As an >>> exercise, I supplemented these points and created a series of small decision >>> trees (a full on decision tree would be complex and not very helpful.) I found >>> these trees good intuition pumps for getting the design at least 80% there. >>> Hopefully, this exercise will help you in the same way. >>> >>> >>> >>> >>> Even the label Raw String Literal put the emphasis on the wrong part of the >>> feature. What developers really want is multi-line strings. They want to be >>> able to paste alien source into their Java programs with as little fuss as >>> possible. >>> >>> String raw-ness (not translating escapes) is a tangential aspect, that may or >>> may not be needed to implement multi-line strings. Yes, the regex and Window's >>> file path arguments in JEP 326 are still valid, but this aspect needs to be >>> separated from the main part of the design. Further in the discussion, we'll >>> see that raw-ness is really a many-headed hydra, best slain one head at a time. >>> >>> >>> >>> >>> We have to be honest. We know Java's primary market. Sure we want to embed Java >>> in Java for writing tests. Sure there is JavaScript and CSS in web pages. >>> Nevertheless, most uses of multi-line will be for non-complex grammars. >>> Specifically, grammars that don't require special handling of multi-character >>> delimiter sequences. If you can accept this, then the solution set is much >>> smaller. >>> >>> >>> >>> >>> This is an easy one. Familiarity is key to feature education. Radical wandering >>> off with new syntax is not helpful to anyone but bloggers and authors. >>> >>> >>> >>> >>> If you buy into the familiarity argument, then double quote is really only >>> choice for a delimiter. Double quote already indicates a string literal. Single >>> quote indicates a character. 
We don?t want to gratuitously burn unused symbols >>> like backtick. Backslash works for regex but maybe not for others. Combinations >>> and nonces just introduce new noise when our original goal was to reduce noise >>> and complexity. >>> >>> >>> >>> >>> Other languages avoid delimiter escape sequences by doubling up. Example, >>> "abc""def" -> abc"def. This concept is unfamiliar to Java developers, why >>> change now. Escape sequences are what we know. >>> >>> >>> >>> >>> Language designers got very nervous when I suggested infinite delimiter >>> sequences in the original proposal; lexically sacrilegious. I felt strongly >>> that it was easy to explain and only 1 in 1M developers would ever use more >>> than 4-5 character delimiter sequences. In round two, I have come to agree. >>> This was taking on more complexity than is really warranted, for a use case >>> that doesn?t come along very often. I suggest we only need single and triple >>> double quotes. A single double quote works today, so no argument there. Double >>> double quotes means empty string, no problem. Triple double quotes are only >>> necessary to avoid having to escape quotes in alien source. >>> >>> String json = """ >>> { >>> "name": "Jean Smith", >>> "age": 32, >>> "location": "San Jose" >>> } >>> """; >>> >>> versus >>> >>> String json = " >>> { >>> \"name\": \"Jean Smith\", >>> \"age\": 32, >>> \"location\": \"San Jose\" >>> } >>> "; >>> >>> This second case is where we wandered off the tracks with raw-ness. We assumed >>> raw-ness is necessary to avoid all the backslashes. Most cases can be handled >>> with triple double quotes. >>> >>> Okay, so why not more combinations? Simply because, most of the time they are >>> not needed. On the rare occasion we do have nested triple double quotes, we can >>> then use escape sequences. 
>>> >>> String nestedJSON = """ >>> \"\"\" >>> { >>> "name": "Jean Smith", >>> "age": 32, >>> "location": "San Jose" >>> } >>> \"\"\"; >>> """; >>> >>> or better yet, you only have to escape every third double quote >>> >>> String nestedJSON = """ >>> \""" >>> { >>> "name": "Jean Smith", >>> "age": 32, >>> "location": "San Jose" >>> } >>> \"""; >>> """; >>> >>> Not so evil and it's familiar. >>> >>> >>> >>> >>> Meaning, you can only use single quotes for simple strings and triple quotes for >>> multi-line strings. I don't have a strong opinion other than it seems like an >>> unneeded restriction. The only argument I've heard has been for better error >>> recovery when missing a close delimiter during parsing. My counter for that >>> argument is that if you are processing multi-line strings then you can easily >>> track the first newline after the opening delimiter and recover from there. I >>> implemented that recovery in javac and worked out well. >>> >>> >>> >>> >>> >>> Cooked (translated escape sequences) should be the default. Why should a >>> multi-line string be different than a simple string? We have a solution for >>> embedding double quote. Single quotes don't require escaping. Tabs and newlines >>> can exist as is. Unicode characters can be either an escape sequence or the >>> unicode character. So the only problem case is backslash. I would argue that >>> the rare backslash can be escaped. If not, then the developer can use the >>> raw-ness solution. >>> >>> >>> >>> >>> If we don't translate newlines, then source is not transferable across >>> platforms. That is, a source from one platform may not execute the same way on >>> another platform. Translating consistently guarantees execution consistency. As >>> a note, programming languages that didn't translate newlines in multi-line >>> string literals typically regretted it later (Python.) 
>>> >>> >>> >>> >>> With the original Raw String Literal proposal, there was concern about leading >>> and trailing nested delimiters. If we default to cooked strings, then we use >>> can use \". >>> >>> >>> >>> >>> These questions have been answered numerous times and fall into the realm of >>> library support. Same arguments as before, same outcome. >>> >>> >>> To summarize the bold paths at this point; >>> - multi-line strings are an extension of traditional simple strings >>> - newlines in a string are no longer an error and the string can extend across >>> several lines >>> - error recovery can pick up at the first newline after the opening delimiter >>> - multi-line strings process escape sequences (including unicode) in the same >>> way as simple strings >>> - multiple double quotes are handled with escape sequences >>> - triple double quote delimiter is introduced to avoid escaping simple double >>> quote sequences >>> >>> Generally, I think this is very much in the traditional Java spirit. >>> >>> >>> Now, let's move on to the lesser but more interesting issue. As I stated above, >>> raw-ness is a multi-headed beast. Raw-ness involves the turning off the >>> translation of >>> - escape sequences >>> - unicode escapes >>> - delimiter sequences >>> - escape sequence prefix (backslash) >>> - tabs and newlines (control characters in general) >>> >>> Sometimes we need all of the translations, sometimes few and sometimes none. In >>> the multi-line discussion above, we see we don't need raw as much as we might >>> have expected. Maybe for occasional backslashes, as in regex and Windows paths >>> strings. >>> >>> >>> >>> >>> >>> The original Raw String Literal proposal suggested that raw-ness was a property >>> of the whole string literal and thus we proposed an alternate delimiter syntax >>> just to emphasize that fact. If we accept the bold path of multi-line >>> discussion above, then alternate delimiter is out. 
This leaves prefixing as the >>> best option to bless a string literal with raw-ness. >>> >>> At this point, I would like to suggest an alternate, maybe progressive way to >>> think of raw-ness. Since the original proposal, I have been thinking of >>> raw-ness as a state of processing the literal. State is certainly obvious in >>> the scanner implementation, why not raise that to the language level? If it is >>> a state then we should be able to enter and leave that state in some way. >>> Escape sequences are an obvious way of transitioning translation in the string. >>> \- and \+ are available and not currently recognized as valid escape sequences, >>> why not \- and \+ to toggle escape processing? >>> >>> String a = "cooked \-raw\+ cooked"; // cooked raw cooked - a little odd but not >>> so much so >>> String b = "abc\-\\\\\+def"; // abc\\\\def - struggling >>> String c = "\-abc\\\\def"; // abc\\\\def - more readable as an inner >>> prefix >>> String d = "abc\-\-def\+\+ghi"; // abc\-def\+ghi - raw on "\-" is "\" and >>> "-", raw off "\+" is "\" and "+" >>> String e = """\-"abc"\+"""; // "abc" - \- and \+ act a no-ops of sorts >>> >>> Comparing property vs state: >>> >>> Runtime.getRuntime().exec(R""" "C:\Program Files\foo" bar""".strip()); >>> Runtime.getRuntime().exec("""\-"C:\Program Files\foo" bar"""); >>> >>> System.out.println("this".matches(R"\w\w\w\w")); >>> System.out.println("this".matches("\-\w\w\w\w")); >>> >>> String html = R""" >>> >>> >>>

Hello World.

>>> >>> >>> """.align(); >>> String html = """\- >>> >>> >>>

Hello World.

>>> >>> >>> """.align(); >>> >>> >>> String nested = """ >>> String EXAMPLE_TEST = "This is my small example " >>> + "string which I'm going to " >>> + "use for pattern matching."; >>> """ + >>> R""" >>> System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t")); >>> """; >>> String nested = """ >>> String EXAMPLE_TEST = "This is my small example " >>> + "string which I'm going to " >>> + "use for pattern matching."; >>> \- >>> System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t")); >>> \+ >>> """; >>> >>> Hopefully, this is a good starting point for discussion. As before, I'm >>> pragmatic about which direction we go, so feel free to comment. >>> >>> Cheers, >>> >>> -- Jim >>> >>> >>> >>> >>> >>> >>> >>> >>> From guy.steele at oracle.com Mon Jan 7 20:58:26 2019 From: guy.steele at oracle.com (Guy Steele) Date: Mon, 7 Jan 2019 15:58:26 -0500 Subject: Enhancing Java String Literals Round 2 In-Reply-To: <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> References: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> Message-ID: > On Jan 6, 2019, at 12:43 PM, Brian Goetz wrote: > . . . > (Of these, my current favorite is using the backslash: ?cooked?, ???cooked and ML-capable?, \?raw?, \???raw and ML capable?. The use of \ suggests ?the backslashes have been pre-added for you?, building on existing associations with backslash.) > > Are there other credible candidates that I?ve missed? I like the idea of cooked-ness and multiline-ness being orthogonal, and this proposal captures that neatly. But: Even though it may not be used often, I still worry about being able to support multiple levels of nesting and/or being able to incorporate ANY raw string. There have been comments pro and con about allowing any number of double-quotes, or any odd number of double quotes. (I briefly pondered a compromise that would allow the use of one, tree or five double quotes! But this morning I grew cold on it.) 
So here is a variation on your proposal. The possible cases are:

    "single line, may contain escapes"
    \"single line, no escapes, cannot contain a double quote"
    """multiline, may contain escapes"""
    \""…multiline, no escapes, cannot contain the nonce followed by two double quotes…""

where "…" represents a nonce string (the same string at each end), which has to be one of the following:

* a single printable character (possibly further limit this choice?) that is not an encloser
* a left encloser, a string of characters that could be a Java identifier, and a matching right encloser (possibly further limit the set of enclosers and/or identifier characters that may be used?)

Note that the nonce may be a double quote character if desired. So actual examples are:

    \"""multiline, no escapes, cannot contain three consecutive double quotes"""
    \""/multiline, no escapes, cannot contain a slash followed by two double quotes/""
    \""[HTML]multiline, no escapes, cannot contain left bracket, H, T, M, L, right bracket, double quote, double quote [HTML]""

So it is always possible to include a raw string literal within another raw string literal by choosing an appropriate nonce, and yet you don't have to choose a weird nonce in the manifold situations where you don't need one.

One choice is to decide that the only permitted choices for the nonce are a double quote or square brackets. No one is likely to be confused by seeing an even number (two) of double quotes into thinking they denote an empty string, because the initial ones are immediately preceded by "\" and the final ones are immediately preceded by "]".

Food for thought.
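As a rough illustration of the nonce rule above, here is a minimal sketch of how a scanner might extract the body of a `\""<nonce>...<nonce>""` literal. It assumes the restricted reading in which the nonce is either a single non-encloser character or a square-bracketed Java identifier; the class name `NonceRawString` and the method `body` are hypothetical, not part of any proposal or of javac.

```java
// Hypothetical sketch: extracts the raw body of a literal written as
//   \""NONCE body NONCE""
// Only two nonce shapes are handled here (an assumption for brevity):
// a single character that is not an encloser, or [Identifier].
final class NonceRawString {

    static String body(String literal) {
        String open = "\\\"\"";                       // the three characters: \ " "
        if (!literal.startsWith(open) || literal.length() <= open.length()) {
            throw new IllegalArgumentException("not a nonce-delimited raw literal");
        }
        int start = open.length();
        char first = literal.charAt(start);
        String nonce;
        if (first == '[') {
            // Bracketed nonce: '[' identifier ']'
            int close = literal.indexOf(']', start);
            if (close < 0 || !isIdentifier(literal.substring(start + 1, close))) {
                throw new IllegalArgumentException("malformed bracketed nonce");
            }
            nonce = literal.substring(start, close + 1);
        } else if ("({<)}>]".indexOf(first) >= 0) {
            // Other enclosers are left out of this sketch.
            throw new IllegalArgumentException("unsupported encloser: " + first);
        } else {
            nonce = String.valueOf(first);            // single-character nonce, e.g. " or /
        }
        int bodyStart = start + nonce.length();
        String terminator = nonce + "\"\"";           // nonce followed by two double quotes
        int end = literal.indexOf(terminator, bodyStart);
        if (end < 0) {
            throw new IllegalArgumentException("missing closing delimiter");
        }
        // No escape processing: the body is returned exactly as written.
        return literal.substring(bodyStart, end);
    }

    private static boolean isIdentifier(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With the `[HTML]` nonce, a body containing three consecutive double quotes passes through untouched, which is the nesting case the proposal is meant to cover. The first occurrence of the terminator ends the literal, matching the "cannot contain the nonce followed by two double quotes" restriction.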
From brian.goetz at oracle.com Mon Jan 7 22:13:07 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 7 Jan 2019 17:13:07 -0500 Subject: Enhancing Java String Literals Round 2 In-Reply-To: References: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> Message-ID: <5C9CAF8A-826A-4E10-B717-927B922099BA@oracle.com>

> \""…multiline, no escapes, cannot contain the nonce followed by two double quotes…""
>
> where "…" represents a nonce string (the same string at each end), which has to be one of the following:

This is an interesting middle ground.

Off the top of my head, I would think we'd have to pretty seriously restrict the characters, so as to avoid parsing ambiguities:

    String s = "";       // is this an empty string, or using semicolon as a nonce?
    int i = "".length()  // am I invoking a method on the empty string, or using dot as a nonce?

From guy.steele at oracle.com Mon Jan 7 21:58:22 2019 From: guy.steele at oracle.com (Guy Steele) Date: Mon, 7 Jan 2019 16:58:22 -0500 Subject: Enhancing Java String Literals Round 2 In-Reply-To: <5C9CAF8A-826A-4E10-B717-927B922099BA@oracle.com> References: <2DC89318-2FDF-4E23-847B-061CDD04E0EC@oracle.com> <8E4055BD-8266-4C62-87CB-B2FC141619EC@oracle.com> <5C9CAF8A-826A-4E10-B717-927B922099BA@oracle.com> Message-ID: <550D07FE-E15A-48C7-A7A5-9445A5625AAB@oracle.com>

> On Jan 7, 2019, at 5:13 PM, Brian Goetz wrote:
>
>> \""…multiline, no escapes, cannot contain the nonce followed by two double quotes…""
>>
>> where "…" represents a nonce string (the same string at each end), which has to be one of the following:
>
> This is an interesting middle ground.
>
> Off the top of my head, I would think we'd have to pretty seriously restrict the characters, so as to avoid parsing ambiguities:
>
> String s = ""; // is this an empty string, or using semicolon as a nonce?

It's an empty string, because only \"" is followed by a nonce.
> int i = "".length() // am I invoking a method on the empty string, or using dot as a nonce?

The dot is not a nonce, because only \"" is followed by a nonce.

From brian.goetz at oracle.com Tue Jan 8 15:22:17 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 8 Jan 2019 10:22:17 -0500 Subject: We need more keywords, captain! Message-ID:

This document proposes a possible move that will buy us some breathing room in the perpetual problem where the keyword-management tail wags the programming-model dog.

## We need more keywords, captain!

Java has a fixed set of _keywords_ (JLS 3.9) which are not allowed to be used as identifiers. This set has remained quite stable over the years (for good reason), with the exceptions of `assert` added in 1.4, `enum` added in 5, and `_` added in 9. In addition, there are also several _reserved identifiers_ (`true`, `false`, and `null`) which behave almost like keywords.

Over time, as the language evolves, language designers face a challenge; the set of keywords imagined in version 1.0 is rarely suitable for expressing all the things we might ever want our language to express. We have several tools at our disposal for addressing this problem:

 - Eminent domain. Take words that were previously identifiers, and turn them into keywords, as we did with `assert` in 1.4.

 - Recycle. Repurpose an existing keyword for something that it was never really meant for (such as using `default` for annotation values or default methods).

 - Do without. Find a way to pick a syntax that doesn't require a new keyword, such as using `@interface` for annotations instead of `annotation` -- or don't do the feature at all.

 - Smoke and mirrors. Create the illusion of context-dependent keywords through various linguistic heroics (restricted keywords, reserved type names.)

In any given situation, all of these options are on the table -- but most of the time, none of these options are very good.
The lack of reasonable options for extending the syntax of the language threatens to become a significant impediment to language evolution.

#### Why not "just" make new keywords?

While it may be legal for us to declare `i` to be a keyword in a future version of Java, this would likely break every program in the world, since `i` is used so commonly as an identifier. (When the `assert` keyword was added in 1.4, it broke every testing framework.) The cost of remediating the effect of such incompatible changes varies as well; invalidating a name choice for a local variable has a local fix, but invalidating the name of a public type or an interface method might well be fatal.

Additionally, the keywords we're likely to want to reclaim are often those that are popular as identifiers (e.g., `value`, `var`, `method`), making such fatal collisions more likely. In some cases, if the keyword candidate in question is sufficiently rarely used as an identifier, we might still opt to take that source-compatibility hit -- but names that are less likely to collide (e.g., `usually_but_not_always_final`) are likely not the ones we want in our language. Realistically, this is unlikely to be a well we can go to very often, and the bar must be very high.

#### Why not "just" live with the keywords we have?

Reusing keywords in multiple contexts has ample precedent in programming languages, including Java. (For example, we (ab)use `final` for "not mutable", "not overridable", and "not extensible".) Sometimes, using an existing keyword in a new context is natural and sensible, but usually it's not our first choice. Over time, as the range of demands we place on our keyword set expands, this may well descend into the ridiculous; no one wants to use `null final` as a way of negating finality.
(While one might think such things are too ridiculous to consider, note that we received serious-seeming suggestions during JEP 325 to use `new switch` to describe a switch with different semantics. Presumably to be followed by `new new switch` in ten years.)

Of course, one way to live without making new keywords is to stop evolving the language entirely. While there are some who think this is a fine idea, doing so because of the lack of available tokens would be a silly reason. We are convinced that Java has a long life ahead of it, and developers are excited about new features that enable them to write more expressive and reliable code.

#### Why not "just" make contextual keywords?

At first glance, contextual keywords (and their friends, such as reserved type identifiers) may appear to be a magic wand; they let us create the illusion of adding new keywords without breaking existing programs. But the positive track record of contextual keywords hides a great deal of complexity and distortion.

Each grammar position is its own story; contextual keywords that might be used as modifiers (e.g., `readonly`) have different ambiguity considerations than those that might be used in code (e.g., a `matches` expression). The process of selecting a contextual keyword is not a simple matter of adding it to the grammar; each one requires an analysis of potential current and future interactions. Similarly, each token we try to repurpose may have its own special considerations; for example, we could justify the use of `var` as a reserved type name because the naming conventions are so broadly adhered to. Finally, the use of contextual keywords in certain syntactic positions can create additional considerations for extending the syntax later.

Contextual keywords create complexity for specifications, compilers, and IDEs.
With one or two special cases, we can often deal well enough, but if special cases were to become more pervasive, this would likely result in more significant maintenance costs or bug tail. While it is easy to dismiss this as "not my problem", in reality, this is everybody's problem. IDEs often have to guess whether a use of a contextual keyword is a keyword or identifier, and they may not have enough information to make a good guess until they've seen more input. This results in worse user highlighting, auto-completion, and refactoring abilities -- or worse. These problems quickly become everyone's problems.

So, while contextual keywords are one of the tools in our toolbox, they should also be used sparingly.

#### Why is this a problem?

Aside from the obvious consequences of these problems (clunky syntax, complexity, bugs), there is a more insidious hidden cost -- distortion. The accidental details of keyword management pose a constant risk of distortion in language design.

One could consider the choice to use `@interface` instead of `annotation` for annotations to be a distortion; having a descriptive name rather than a funky combination of punctuation and keyword would surely have made it easier for people to become familiar with annotations.

In another example, the set of modifiers (`public`, `private`, `static`, `final`, etc) is not complete; there is no way to say "not final" or "not static". This, in turn, means that we cannot create features where variables or classes are `final` by default, or members are `static` by default, because there's no way to denote the desire to opt out of it. While there may be reasons to justify a locally suboptimal default anyway (such as global consistency), we want to make these choices deliberately, not have them made for us by the accidental details of keyword management. Choosing to leave out a feature for reasons of simplicity is fine; leaving it out because we don't have a way to denote the obvious semantics is not.
It may not be obvious from the outside, but this is a constant problem in evolving the language, and an ongoing tax that we all pay, directly or indirectly.

## We need a new source of keyword candidates

Every time we confront this problem, the overwhelming tendency is to punt and pick one of the bad options, because the problem only comes along every once in a while. But, with the features in the pipeline, I expect it will continue to come along with some frequency, and I'd rather get ahead of it. Given that all of these current options are problematic, and there is not even a least-problematic move that applies across all situations, my inclination is to try to expand the set of lexical forms that can be used as keywords.

As a not-serious example, take the convention that we've used for experimental features, where we prefix provisional keywords in prototypes with two underscores, as we did with `__ByValue` in the Valhalla prototype. (We commonly do this in feature proposals and prototypes, mostly to signify "this keyword is a placeholder for a syntax decision to be made later", but also because it permits a simple implementation that is unlikely to collide with existing code.) We could, for example, carve out the space of identifiers that begin with underscore as being reserved for keywords. Of course, this isn't so pretty, and it also means we'd have a mix of underscore and non-underscore keywords, so it's not a serious suggestion, as much as an example of the sort of move we are looking for.

But I do have a serious suggestion: allow _hyphenated_ keywords where one or more of the terms are already keywords or reserved identifiers. Unlike restricted keywords, this creates much less trouble for parsing, as (for example) `non-null` cannot be confused for a subtraction expression, and the lexer can always tell with fixed lookahead whether `a-b` is three tokens or one. This gives us a lot more room for creating new, less-conflicting keywords.
And these new keywords are likely to be good names, too, as many of the missing concepts we want to add describe their relationship to existing language constructs -- such as `non-null`.

Here are some examples where this approach might yield credible candidates. (Note: none of these are being proposed here; this is merely an illustrative list of examples of how this mechanism could form keywords that might, in some particular possible future, be useful and better than the alternatives we have now.)

 - `non-null`
 - `non-final`
 - `package-private` (the default accessibility for class members, currently not denotable)
 - `public-read` (publicly readable, privately writable)
 - `null-checked`
 - `type-static` (a concept needed in Valhalla, which is static relative to a particular specialization of a class, rather than the class itself)
 - `default-value`
 - `eventually-final` (what the `@Stable` annotation currently suggests)
 - `semi-final` (an alternative to `sealed`)
 - `exhaustive-switch` (opting into exhaustiveness checking for statement switches)
 - `enum-class`, `annotation-class`, `record-class` (we might have chosen these as an alternative to `enum` and `@interface`, had we had the option)
 - `this-class` (to describe the class literal for the current class)
 - `this-return` (a common request is a way to mark a setter or builder method as returning its receiver)

(Again, the point is not to debate the merits of any of these specific examples; the point is merely to illustrate what we might be able to do with such a mechanism.)

Having this as an option doesn't mean we can't also use the other approaches when they are suitable; it just means we have more, and likely less fraught, options with which to make better decisions.

There are likely to be other lexical schemes by which new keywords can be created without impinging on existing code; this one seems credible and reasonably parsable by both machines and humans.
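To make the lexing claim concrete, here is a minimal sketch of how a tokenizer might decide whether `a-b` is one token or three. It is only an illustration under stated assumptions: the keyword set (`non-null`, `non-final`, `break-with`) is hypothetical, only two-part keywords are handled, and the class `HyphenLexer` is not taken from javac. For simplicity the sketch scans the whole word after the hyphen; a real lexer could bound that lookahead by the length of the longest hyphenated keyword.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of lexing hyphenated keywords.
final class HyphenLexer {

    // Illustrative keyword set only; none of these are actual Java keywords.
    private static final Set<String> HYPHENATED =
            Set.of("non-null", "non-final", "break-with");

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isJavaIdentifierStart(c)) {
                int end = wordEnd(src, i);
                // Look ahead: is the word immediately followed by '-' and another word?
                if (end + 1 < src.length()
                        && src.charAt(end) == '-'
                        && Character.isJavaIdentifierStart(src.charAt(end + 1))) {
                    int end2 = wordEnd(src, end + 1);
                    String joined = src.substring(i, end2);
                    if (HYPHENATED.contains(joined)) {
                        tokens.add(joined);           // one keyword token
                        i = end2;
                        continue;
                    }
                }
                tokens.add(src.substring(i, end));    // ordinary word; '-' is lexed separately
                i = end;
            } else {
                tokens.add(String.valueOf(c));        // single-character token, e.g. '-'
                i++;
            }
        }
        return tokens;
    }

    // Index just past the identifier-like word starting at 'from'.
    private static int wordEnd(String s, int from) {
        int j = from;
        while (j < s.length() && Character.isJavaIdentifierPart(s.charAt(j))) {
            j++;
        }
        return j;
    }
}
```

Under these assumptions, `non-null a-b` lexes as the keyword `non-null` followed by the three tokens `a`, `-`, `b`, so existing subtraction expressions are unaffected unless their exact spelling is in the keyword set.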
#### "But that's ugly"

Invariably, some percentage of readers will have an immediate and visceral reaction to this idea. Let's stipulate for the record that some people will find this ugly. (At least, at first. Many such reactions are possibly-transient (see what I did there?) responses to unfamiliarity.)

From brian.goetz at oracle.com Tue Jan 8 17:35:48 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 8 Jan 2019 12:35:48 -0500 Subject: We need more keywords, captain! In-Reply-To: References: Message-ID:

When discussing this today at our compiler meeting, we realized a few more places where the lack of keywords produces distortions we don't even notice. In expression switch, we settled on `break value` as the way to provide a value for a switch expression when the shorthand (`case L -> e`) doesn't suffice, but this was painful for everyone. It's painful for users because there's now work required to disambiguate whether `break foo` is a labeled break or a value break; it was even more painful to specify, because a new form of abrupt completion had to be threaded through the spec.

Being able to call this something like `break-with v` (or some other derived keyword) would have made this all a lot simpler. (BTW, we can still do this, since expression-switch is still in preview.)

Moral of the story: even just a few minutes of brainstorming led us to several applications of this approach that we hadn't seen a few days ago.

On 1/8/2019 10:22 AM, Brian Goetz wrote:
> This document proposes a possible move that will buy us some breathing
> room in the perpetual problem where the keyword-management tail wags
> the programming-model dog.
>
>
> ## We need more keywords, captain!
>
> Java has a fixed set of _keywords_ (JLS 3.9) which are not allowed to
> be used as identifiers. This set has remained quite stable over the
> years (for good reason), with the exceptions of `assert` added in 1.4,
> `enum` added in 5, and `_` added in 9.
In addition, there are also > several _reserved identifiers_ (`true`, `false`, and `null`) which > behave almost like keywords. > > Over time, as the language evolves, language designers face a > challenge; the set of keywords imagined in version 1.0 are rarely > suitable for expressing all the things we might ever want our language > to express.? We have several tools at our disposal for addressing this > problem: > > ?- Eminent domain.? Take words that were previously identifiers, and > ?? turn them into keywords, as we did with `assert` in 1.4. > > ?- Recycle.? Repurpose an existing keyword for something that it was > ?? never really meant for (such as using `default` for annotation > ?? values or default methods). > > ?- Do without.? Find a way to pick a syntax that doesn't require a > ?? new keyword, such as using `@interface` for annotations instead of > ?? `annotation` -- or don't do the feature at all. > > ?- Smoke and mirrors.? Create the illusion of context-dependent > ?? keywords through various linguistic heroics (restricted keywords, > ?? reserved type names.) > > In any given situation, all of these options are on the table -- but > most of the time, none of these options are very good.? The lack of > reasonable options for extending the syntax of the language threatens > to become a significant impediment to language evolution. > > #### Why not "just" make new keywords? > > While it may be legal for us to declare `i` to be a keyword in a > future version of Java, this would likely break every program in the > world,? since `i` is used so commonly as an identifier.? (When the > `assert` keyword was added in 1.4, it broke every testing framework.) > The cost of remediating the effect of such incompatible changes varies > as well; invalidating a name choice for a local variable has a local > fix,? but invalidating the name of a public type or an interface > method might well be fatal. 
> > Additionally, the keywords we're likely to want to reclaim are often > those that are popular as identifiers (e.g., `value`, `var`, > `method`), making such fatal collisions more likely.? In some cases, > if the keyword candidate in question is sufficiently rarely used as an > identifier, we might still opt to take that source-compatibility hit > -- but names that are less likely to collide (e.g., > `usually_but_not_always_final`) are likely not the ones we want in our > language. Realistically, this is unlikely to be a well we can go to > very often, and the bar must be very high. > > #### Why not "just" live with the keywords we have? > > Reusing keywords in multiple contexts has ample precedent in > programming languages, including Java.? (For example, we (ab)use `final` > for "not mutable", "not overridable", and "not extensible".) > Sometimes, using an existing keyword in a new context is natural and > sensible, but usually it's not our first choice.? Over time, as the > range of demands we place on our keyword set expands, this may well > descend into the ridiculous; no one wants to use `null final` as a way > of negating finality.? (While one might think such things are too > ridiculous to consider, note that we received serious-seeming > suggestions during JEP 325 to use `new switch` to describe a switch > with different semantics.? Presumably to be followed by `new new > switch` in ten years.) > > Of course, one way to live without making new keywords is to stop > evolving the language entirely.? While there are some who think this > is a fine idea, doing so because of the lack of available tokens would > be a silly reason. We are convinced that Java has a long life ahead of > it, and developers are excited about new features that enable to them > to write more expressive and reliable code. > > #### Why not "just" make contextual keywords? 
> > At first glance, contextual keywords (and their friends, such as > reserved type identifiers) may appear to be a magic wand; they let us > create the illusion of adding new keywords without breaking existing > programs.? But the positive track record of contextual keywords hides > a great deal of complexity and distortion. > > Each grammar position is its own story; contextual keywords that might > be used as modifiers (e.g., `readonly`) have different ambiguity > considerations than those that might be use in code (e.g., a `matches` > expression).? The process of selecting a contextual keyword is not a > simple matter of adding it to the grammar; each one requires an > analysis of potential current and future interactions.? Similarly, > each token we try to repurpose may have its own special > considerations;? for example, we could justify the use of `var` as a > reserved type name? because because the naming conventions are so > broadly adhered to.? Finally, the use of contextual keywords in > certain? syntactic positions can create additional considerations for > extending the syntax later. > > Contextual keywords create complexity for specifications, compilers, > and IDEs.? With one or two special cases, we can often deal well > enough, but if special cases were to become more pervasive, this would > likely result in more significant maintenance costs or bug tail. While > it is easy to dismiss this as ?not my problem?, in reality, this is > everybody?s problem. IDEs often have to guess whether a use of a > contextual keyword is a keyword or identifier, and it may not have > enough information to make a good guess until it?s seen more input. > This results in worse user highlighting, auto-completion, and > refactoring abilities ? or worse.? These problems quickly become > everyone's problems. > > So, while contextual keywords are one of the tools in our toolbox, > they should also be used sparingly. > > #### Why is this a problem? 
> > Aside from the obvious consequences of these problems (clunky syntax, > complexity, bugs), there is a more insidious hidden cost -- > distortion.? The accidental details of keyword management pose a > constant risk of distortion in language design. > > One could consider the choice to use `@interface` instead of > `annotation` for annotations to be a distortion; having a descriptive > name rather than a funky combination of punctuation and keyword would > surely have made it easier for people to become familiar with > annotations. > > In another example, the set of modifiers (`public`, `private`, > `static`, `final`, etc) is not complete; there is no way to say ?not > final? or ?not static?. This, in turn, means that we cannot create > features where variables or classes are `final` by default, or members > are `static` by default, because there?s no way to denote the desire > to opt out of it.? While there may be reasons to justify a locally > suboptimal default anyway (such as global consistency), we want to > make these choices deliberately, not have them made for us by the > accidental details of keyword management. Choosing to leave out a > feature for reasons of simplicity is fine; leaving it out because we > don't have a way to denote the obvious semantics is not. > > It may not be obvious from the outside, but this is a constant problem > in evolving the language, and an ongoing tax that we all pay, directly > or indirectly. > > ## We need a new source of keyword candidates > > Every time we confront this problem, the overwhelming tendency is to > punt and pick one of the bad options, because the problem only comes > along every once in a while.? But, with the features in the pipeline, I > expect it will continue to come along with some frequency, and I?d > rather get ahead of it. 
Given that all of these current options are > problematic, and there is not even a least-problematic move that > applies across all situations, my inclination is to try to expand the > set of lexical forms that can be used as keywords. > > As a not-serious example, take the convention that we?ve used for > experimental features, where we prefix provisional keywords in > prototypes with two underscores, as we did with `__ByValue` in the > Valhalla prototype. (We commonly do this in feature proposals and > prototypes, mostly to signify ?this keyword is a placeholder for a > syntax decision to be made later?, but also because it permits a > simple implementation that is unlikely to collide with existing code.) > We could, for example, carve out the space of identifiers that begin > with underscore as being reserved for keywords. Of course, this isn?t > so pretty, and it also means we'd have a mix of underscore and > non-underscore keywords, so it?s not a serious suggestion, as much as > an example of the sort of move we are looking for. > > But I do have a serious suggestion: allow _hyphenated_ keywords where > one or more of the terms are already keywords or reserved identifiers. > Unlike restricted keywords, this creates much less trouble for > parsing, as (for example) `non-null` cannot be confused for a > subtraction expression, and the lexer can always tell with fixed > lookahead whether `a-b` is three tokens or one. This gives us a lot > more room for creating new, less-conflicting keywords. And these new > keywords are likely to be good names, too, as many of the missing > concepts we want to add describe their relationship to existing > language constructs -- such as `non-null`. > > Here?s some examples where this approach might yield credible > candidates. 
(Note: none of these are being proposed here; this is > merely an illustrative list of examples of how this mechanism could > form keywords that might, in some particular possible future, be > useful and better than the alternatives we have now.) > > ? - `non-null` > ? - `non-final` > ? - `package-private` (the default accessibility for class members, > currently not denotable) > ? - `public-read` (publicly readable, privately writable) > ? - `null-checked` > ? - `type-static` (a concept needed in Valhalla, which is static > relative to a particular specialization of a class, rather than the > class itself) > ? - `default-value` > ? - `eventually-final` (what the `@Stable` annotation currently suggests) > ? - `semi-final` (an alternative to `sealed`) > ? - `exhaustive-switch` (opting into exhaustiveness checking for > statement > ??? switches) > ? - `enum-class`, `annotation-class`, `record-class` (we might have > chosen these > ???? as an alternative to `enum` and `@interface`, had we had the option) > ? - `this-class` (to describe the class literal for the current class) > ? - `this-return` (a common request is a way to mark a setter or > builder method > ??? as returning its receiver) > > (Again, the point is not to debate the merits of any of these specific > examples; the point is merely to illustrate what we might be able to do > with such a mechanism.) > > Having this as an option doesn't mean we can't also use the other > approaches when they are suitable; it just means we have more, and > likely less fraught, options with which to make better decisions. > > There are likely to be other lexical schemes by which new keywords can > be created without impinging on existing code; this one seems credible > and reasonably parsable by both machines and humans. > > #### "But that's ugly" > > Invariably, some percentage of readers will have an immediate and > visceral reaction to this idea.? Let's stipulate for the record that > some people will find this ugly.? 
(At least, at first.? Many such > reactions are possibly-transient (see what I did there?) responses > to unfamiliarity.) > > From guy.steele at oracle.com Tue Jan 8 17:23:36 2019 From: guy.steele at oracle.com (Guy Steele) Date: Tue, 8 Jan 2019 12:23:36 -0500 Subject: We need more keywords, captain! In-Reply-To: References: Message-ID: <86DCF84D-5C37-47C5-91BA-93385206FD49@oracle.com> Actually, even better than `break-with` would be `break-return`. It?s clearly a kind of `break`, and also clearly a kind of `return`. I think maybe this application alone has won me over to the idea of hyphenated keywords. (Then again, for this specific application we don?t even need the hyphen; we could just write `break return v;`.) ?Guy > On Jan 8, 2019, at 12:35 PM, Brian Goetz wrote: > > When discussing this today at our compiler meeting, we realized a few more places where the lack of keywords produce distortions we don't even notice. In expression switch, we settled on `break value` as the way to provide a value for a switch expression when the shorthand (`case L -> e`) doesn't suffice, but this was painful for everyone. It's painful for users because there's now work required to disambiguate whether `break foo` is a labeled break or a value break; it was even more painful to specify, because a new form of abrupt completion had to be threaded through the spec. > > Being able to call this something like `break-with v` (or some other derived keyword) would have made this all a lot simpler. (BTW, we can still do this, since expression-switch is still in preview.) > > Moral of the story: even just a few minutes of brainstorming led us to several applications of this approach that we hadn't seen a few days ago. > > On 1/8/2019 10:22 AM, Brian Goetz wrote: >> This document proposes a possible move that will buy us some breathing room in the perpetual problem where the keyword-management tail wags the programming-model dog. >> >> >> ## We need more keywords, captain! 
>> >> Java has a fixed set of _keywords_ (JLS 3.9) which are not allowed to >> be used as identifiers. This set has remained quite stable over the >> years (for good reason), with the exceptions of `assert` added in 1.4, >> `enum` added in 5, and `_` added in 9. In addition, there are also >> several _reserved identifiers_ (`true`, `false`, and `null`) which >> behave almost like keywords. >> >> Over time, as the language evolves, language designers face a >> challenge; the set of keywords imagined in version 1.0 are rarely >> suitable for expressing all the things we might ever want our language >> to express. We have several tools at our disposal for addressing this >> problem: >> >> - Eminent domain. Take words that were previously identifiers, and >> turn them into keywords, as we did with `assert` in 1.4. >> >> - Recycle. Repurpose an existing keyword for something that it was >> never really meant for (such as using `default` for annotation >> values or default methods). >> >> - Do without. Find a way to pick a syntax that doesn't require a >> new keyword, such as using `@interface` for annotations instead of >> `annotation` -- or don't do the feature at all. >> >> - Smoke and mirrors. Create the illusion of context-dependent >> keywords through various linguistic heroics (restricted keywords, >> reserved type names.) >> >> In any given situation, all of these options are on the table -- but >> most of the time, none of these options are very good. The lack of >> reasonable options for extending the syntax of the language threatens >> to become a significant impediment to language evolution. >> >> #### Why not "just" make new keywords? >> >> While it may be legal for us to declare `i` to be a keyword in a >> future version of Java, this would likely break every program in the >> world, since `i` is used so commonly as an identifier. (When the >> `assert` keyword was added in 1.4, it broke every testing framework.) 
>> The cost of remediating the effect of such incompatible changes varies >> as well; invalidating a name choice for a local variable has a local >> fix, but invalidating the name of a public type or an interface >> method might well be fatal. >> >> Additionally, the keywords we're likely to want to reclaim are often >> those that are popular as identifiers (e.g., `value`, `var`, >> `method`), making such fatal collisions more likely. In some cases, >> if the keyword candidate in question is sufficiently rarely used as an >> identifier, we might still opt to take that source-compatibility hit >> -- but names that are less likely to collide (e.g., >> `usually_but_not_always_final`) are likely not the ones we want in our >> language. Realistically, this is unlikely to be a well we can go to >> very often, and the bar must be very high. >> >> #### Why not "just" live with the keywords we have? >> >> Reusing keywords in multiple contexts has ample precedent in >> programming languages, including Java. (For example, we (ab)use `final` >> for "not mutable", "not overridable", and "not extensible".) >> Sometimes, using an existing keyword in a new context is natural and >> sensible, but usually it's not our first choice. Over time, as the >> range of demands we place on our keyword set expands, this may well >> descend into the ridiculous; no one wants to use `null final` as a way >> of negating finality. (While one might think such things are too >> ridiculous to consider, note that we received serious-seeming >> suggestions during JEP 325 to use `new switch` to describe a switch >> with different semantics. Presumably to be followed by `new new >> switch` in ten years.) >> >> Of course, one way to live without making new keywords is to stop >> evolving the language entirely. While there are some who think this >> is a fine idea, doing so because of the lack of available tokens would >> be a silly reason. 
We are convinced that Java has a long life ahead of >> it, and developers are excited about new features that enable to them >> to write more expressive and reliable code. >> >> #### Why not "just" make contextual keywords? >> >> At first glance, contextual keywords (and their friends, such as >> reserved type identifiers) may appear to be a magic wand; they let us >> create the illusion of adding new keywords without breaking existing >> programs. But the positive track record of contextual keywords hides >> a great deal of complexity and distortion. >> >> Each grammar position is its own story; contextual keywords that might >> be used as modifiers (e.g., `readonly`) have different ambiguity >> considerations than those that might be use in code (e.g., a `matches` >> expression). The process of selecting a contextual keyword is not a >> simple matter of adding it to the grammar; each one requires an >> analysis of potential current and future interactions. Similarly, >> each token we try to repurpose may have its own special >> considerations; for example, we could justify the use of `var` as a >> reserved type name because because the naming conventions are so >> broadly adhered to. Finally, the use of contextual keywords in >> certain syntactic positions can create additional considerations for >> extending the syntax later. >> >> Contextual keywords create complexity for specifications, compilers, >> and IDEs. With one or two special cases, we can often deal well >> enough, but if special cases were to become more pervasive, this would >> likely result in more significant maintenance costs or bug tail. While >> it is easy to dismiss this as ?not my problem?, in reality, this is >> everybody?s problem. IDEs often have to guess whether a use of a >> contextual keyword is a keyword or identifier, and it may not have >> enough information to make a good guess until it?s seen more input. 
>> This results in worse user highlighting, auto-completion, and >> refactoring abilities ? or worse. These problems quickly become >> everyone's problems. >> >> So, while contextual keywords are one of the tools in our toolbox, >> they should also be used sparingly. >> >> #### Why is this a problem? >> >> Aside from the obvious consequences of these problems (clunky syntax, >> complexity, bugs), there is a more insidious hidden cost -- >> distortion. The accidental details of keyword management pose a >> constant risk of distortion in language design. >> >> One could consider the choice to use `@interface` instead of >> `annotation` for annotations to be a distortion; having a descriptive >> name rather than a funky combination of punctuation and keyword would >> surely have made it easier for people to become familiar with >> annotations. >> >> In another example, the set of modifiers (`public`, `private`, >> `static`, `final`, etc) is not complete; there is no way to say ?not >> final? or ?not static?. This, in turn, means that we cannot create >> features where variables or classes are `final` by default, or members >> are `static` by default, because there?s no way to denote the desire >> to opt out of it. While there may be reasons to justify a locally >> suboptimal default anyway (such as global consistency), we want to >> make these choices deliberately, not have them made for us by the >> accidental details of keyword management. Choosing to leave out a >> feature for reasons of simplicity is fine; leaving it out because we >> don't have a way to denote the obvious semantics is not. >> >> It may not be obvious from the outside, but this is a constant problem >> in evolving the language, and an ongoing tax that we all pay, directly >> or indirectly. 
>> >> ## We need a new source of keyword candidates >> >> Every time we confront this problem, the overwhelming tendency is to >> punt and pick one of the bad options, because the problem only comes >> along every once in a while. But, with the features in the pipeline, I >> expect it will continue to come along with some frequency, and I?d >> rather get ahead of it. Given that all of these current options are >> problematic, and there is not even a least-problematic move that >> applies across all situations, my inclination is to try to expand the >> set of lexical forms that can be used as keywords. >> >> As a not-serious example, take the convention that we?ve used for >> experimental features, where we prefix provisional keywords in >> prototypes with two underscores, as we did with `__ByValue` in the >> Valhalla prototype. (We commonly do this in feature proposals and >> prototypes, mostly to signify ?this keyword is a placeholder for a >> syntax decision to be made later?, but also because it permits a >> simple implementation that is unlikely to collide with existing code.) >> We could, for example, carve out the space of identifiers that begin >> with underscore as being reserved for keywords. Of course, this isn?t >> so pretty, and it also means we'd have a mix of underscore and >> non-underscore keywords, so it?s not a serious suggestion, as much as >> an example of the sort of move we are looking for. >> >> But I do have a serious suggestion: allow _hyphenated_ keywords where >> one or more of the terms are already keywords or reserved identifiers. >> Unlike restricted keywords, this creates much less trouble for >> parsing, as (for example) `non-null` cannot be confused for a >> subtraction expression, and the lexer can always tell with fixed >> lookahead whether `a-b` is three tokens or one. This gives us a lot >> more room for creating new, less-conflicting keywords. 
And these new >> keywords are likely to be good names, too, as many of the missing >> concepts we want to add describe their relationship to existing >> language constructs -- such as `non-null`. >> >> Here?s some examples where this approach might yield credible >> candidates. (Note: none of these are being proposed here; this is >> merely an illustrative list of examples of how this mechanism could >> form keywords that might, in some particular possible future, be >> useful and better than the alternatives we have now.) >> >> - `non-null` >> - `non-final` >> - `package-private` (the default accessibility for class members, currently not denotable) >> - `public-read` (publicly readable, privately writable) >> - `null-checked` >> - `type-static` (a concept needed in Valhalla, which is static relative to a particular specialization of a class, rather than the class itself) >> - `default-value` >> - `eventually-final` (what the `@Stable` annotation currently suggests) >> - `semi-final` (an alternative to `sealed`) >> - `exhaustive-switch` (opting into exhaustiveness checking for statement >> switches) >> - `enum-class`, `annotation-class`, `record-class` (we might have chosen these >> as an alternative to `enum` and `@interface`, had we had the option) >> - `this-class` (to describe the class literal for the current class) >> - `this-return` (a common request is a way to mark a setter or builder method >> as returning its receiver) >> >> (Again, the point is not to debate the merits of any of these specific >> examples; the point is merely to illustrate what we might be able to do >> with such a mechanism.) >> >> Having this as an option doesn't mean we can't also use the other >> approaches when they are suitable; it just means we have more, and >> likely less fraught, options with which to make better decisions. 
>> >> There are likely to be other lexical schemes by which new keywords can >> be created without impinging on existing code; this one seems credible >> and reasonably parsable by both machines and humans. >> >> #### "But that's ugly" >> >> Invariably, some percentage of readers will have an immediate and >> visceral reaction to this idea. Let's stipulate for the record that >> some people will find this ugly. (At least, at first. Many such >> reactions are possibly-transient (see what I did there?) responses >> to unfamiliarity.) >> >> > From brian.goetz at oracle.com Tue Jan 8 18:52:09 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 8 Jan 2019 13:52:09 -0500 Subject: Sealed types In-Reply-To: References: <1643c042-6eb9-3ff9-955b-a14152ebfd27@oracle.com> Message-ID: <62ee111a-039c-98af-02ea-66bb42401afa@oracle.com> > > Or, if not additive, but we end up reusing the `final` keyword in the > way shown at the bottom of this email, then we could at least allow > `permits //, TypeA, TypeB` which is maybe nearly as good. In light of this morning's observation about hyphenated keywords ... there's a lot in this thread about why it seemed more attractive to retcon final for sealed-ness rather than create a new conditional keyword.? But it was a bit confusing (because final already has associations), and we ran into the problem that ??? final class X { } already means something, which deprived us of the opportunity to infer a permits clause.? Switching to something derived from `final` (such as `semi-final`) restores that, while keeping the associations with final: ??? semi-final class A ??????? permits X, Y, Z { ... } ??? non-final class X extends A { ... } ??? class Y extends A { }? // implicitly final ??? semi-final class B ??????? /* inferred permits clause */ { ... } I think this is the best of both worlds; we clearly connect to finality, but don't directly overload it. -------------- next part -------------- An HTML attachment was scrubbed... 
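For comparison, here is a small runnable sketch of the same hierarchy shape using the spellings that were later standardized in Java 17: `sealed`, `non-sealed`, and `permits`. This is an assumption relative to the thread, which predates that decision, and the class names are hypothetical. It needs JDK 17 or later. One difference from the strawman above is that a subclass of a sealed class must say `final`, `sealed`, or `non-sealed` explicitly; nothing is implicitly final.

```java
public class SealedSketch {
    // Explicit permits clause, analogous to "semi-final class A permits X, Y".
    static sealed class A permits X, Y { }

    // Analogous to "non-final class X extends A": reopens the hierarchy.
    static non-sealed class X extends A { }

    // Must be marked final explicitly in the shipped design.
    static final class Y extends A { }

    // permits clause omitted: inferred from subclasses in the same compilation unit.
    static sealed class B { }
    static final class B1 extends B { }

    public static String describe(A a) {
        if (a instanceof X) {
            return "X (open)";
        }
        return "Y (final)";
    }

    public static void main(String[] args) {
        System.out.println(describe(new X()));
        System.out.println(describe(new Y()));
        System.out.println(A.class.isSealed());  // reflective check, Java 17+
        System.out.println(X.class.isSealed());
    }
}
```

Note that `non-sealed` is itself a hyphenated keyword of the kind proposed earlier in the thread.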
From john.r.rose at oracle.com Tue Jan 8 22:55:19 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 8 Jan 2019 14:55:19 -0800
Subject: Flow scoping
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com>

On Jan 4, 2019, at 6:07 AM, Tagir Valeev wrote:
>
> For the record: I heavily support this. If then-branch cannot complete normally, then unwrapping the else-branch should preserve the program semantics. It works today, and it should work in future Java as well.

I agree also. But it is uncomfortable that the binding of the flow-scoped variable gets buried in a place that is harder to spot.

Here's a possible compromise: Allow flow-scoped variables to leak out the bottom of a statement, but only if they are predeclared before the statement, in the parent block. They would be predeclared blank. Example:

    preconditions();
    if (mist() && shadow() && !(x instanceof String s))
      throw q;
    else {
      manyLinesOfStuff();
      println(s);  // s obscured by mist and shadow
    }

    <==>

    preconditions();
    String s;
    if (mist() && shadow() && !(x instanceof String s))
      throw q;
    manyLinesOfStuff();
    println(s);  // s obscured by mist and shadow

Since the original s is always DU and never DA, we could choose to allow the flow-bound s to merge with it. The binding would then be discoverable as a dominating declaration, before the "if".

In this compromise, the dominating declaration would have to be introduced before an entire if/else chain containing flow-scoped bindings that are to be passed outside of the chain. That seems reasonable to me, as a compromise between conciseness and ease of reading.

-- John
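The equivalence Tagir describes (unwrapping an else-branch when the then-branch cannot complete normally) can be sketched in runnable form. This assumes a JDK where pattern matching for `instanceof` is available; it was finalized in Java 16, after this thread. The class and method names are hypothetical, and the sketch shows only the plain flow-scoping rule, not the predeclared-blank compromise proposed above.

```java
public class FlowScopingSketch {
    // Form 1: the binding 's' is used inside an explicit else-branch.
    public static int lengthWithElse(Object x) {
        if (!(x instanceof String s)) {
            throw new IllegalArgumentException("not a String");
        } else {
            return s.length();
        }
    }

    // Form 2: the else-branch is unwrapped. Because the then-branch always
    // throws, 's' is definitely matched in the code following the if statement.
    public static int lengthUnwrapped(Object x) {
        if (!(x instanceof String s)) {
            throw new IllegalArgumentException("not a String");
        }
        return s.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthWithElse("mist"));     // prints 4
        System.out.println(lengthUnwrapped("shadow"));  // prints 6
    }
}
```

Both methods behave identically for every input, which is the refactoring-preservation property at stake in the discussion.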
From brian.goetz at oracle.com Tue Jan 8 23:14:29 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 8 Jan 2019 18:14:29 -0500
Subject: Flow scoping
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com>
Message-ID: <19c59b6f-4b87-484c-3531-9cd503e56fca@oracle.com>

Essentially, you're saying that if someone declares a pattern variable that would shadow a DU (final, please!) local, then the variables are merged and the scope is pinned at the scope of the local. That's nice in that the scope and declaration point are now clearer, but on the other hand the concept of binding is now muddier: pattern variables now interact with shadowing, with locals, and maybe get sucked into mutability too ("why not just treat binding as ordinary local assignment"). So it plugs one leak, but opens up several others.

From john.r.rose at oracle.com Tue Jan 8 23:42:28 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 8 Jan 2019 15:42:28 -0800
Subject: Flow scoping
In-Reply-To: <19c59b6f-4b87-484c-3531-9cd503e56fca@oracle.com>
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> <19c59b6f-4b87-484c-3531-9cd503e56fca@oracle.com>

True enough. But it seems that a lot of that machinery is already a sunk cost, for merging bindings in (a instanceof T t || b instanceof T t). The new thing would be putting blank (yes, implicitly final!) declarations into the mix as candidates for merging. That strikes me as a plausible bargain.
From forax at univ-mlv.fr Wed Jan 9 00:07:26 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 9 Jan 2019 01:07:26 +0100 (CET)
Subject: Flow scoping
In-Reply-To:
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com>
Message-ID: <575567993.572436.1546992446030.JavaMail.zimbra@u-pem.fr>

> From: "John Rose"
> To: "Tagir Valeev"
> Cc: "amber-spec-experts"
> Sent: Tuesday, January 8, 2019 23:55:19
> Subject: Re: Flow scoping

> On Jan 4, 2019, at 6:07 AM, Tagir Valeev <amaembo at gmail.com> wrote:

>> For the record: I heavily support this. If then-branch cannot complete normally,
>> then unwrapping the else-branch should preserve the program semantics. It works
>> today, and it should work in future Java as well.

> I agree also. But it is uncomfortable that the binding of the flow-scoped
> variable gets buried in a place that is harder to spot.

> Here's a possible compromise: Allow flow-scoped variables to leak
> out the bottom of a statement, but only if they are predeclared before
> the statement, in the parent block. They would be predeclared blank.
> Example:

> preconditions();
> if (mist() && shadow() && !(x instanceof String s))
>   throw q;
> else {
>   manyLinesOfStuff();
>   println(s); // s obscured by mist and shadow
> }

> <==>

> preconditions();
> String s;
> if (mist() && shadow() && !(x instanceof String s))
>   throw q;
> manyLinesOfStuff();
> println(s); // s obscured by mist and shadow

> Since the original s is always DU and never DA, we could choose
> to allow the flow-bound s to merge with it. The binding would
> then be discoverable as a dominating declaration, before the "if".

Given that this feature will be used by beginners (writing equals is something you do early), I don't think it's a good idea.

While the aim is perhaps noble, making the code more explicit, your proposal has the same kind of issues we had before 8 when the compiler was asking to declare the captured local variables final.
- your proposal transforms the problem from "why does it compile?" to "why doesn't it compile?".
- if there is no declaration of 's', the error will be reported on 's' at the last line and most of the users will not know what to do.
- the IDEs may help, but you should be able to write a basic Java program without an IDE (in jshell, for example) and IDEs may declare 's' at the wrong place (the wrong scope) anyway.
- Groovy and Kotlin already have this kind of feature and the branding "hey look the compiler is smart" seems enough.
- I'm not even sure it makes the code more readable; a lot of users will not understand the code because it looks like you can use 's' in the then branch ('s' is declared above after all).

Basically, your proposal will just make the life of my students harder.

Rémi

> In this compromise, the dominating declaration would have to
> be introduced before an entire if/else chain containing flow-scoped
> bindings that are to be passed outside of the chain. That seems
> reasonable to me, as a compromise between conciseness and
> ease of reading.

> -- John
From john.r.rose at oracle.com Wed Jan 9 01:16:24 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 8 Jan 2019 17:16:24 -0800
Subject: Flow scoping
In-Reply-To: <2890E053-F648-43B1-BADC-D3D180F7BEE7@oracle.com>
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> <575567993.572436.1546992446030.JavaMail.zimbra@u-pem.fr> <2890E053-F648-43B1-BADC-D3D180F7BEE7@oracle.com>
Message-ID: <9123D454-2CB5-40DF-9788-BFDBB657D78E@oracle.com>

On Jan 8, 2019, at 5:15 PM, John Rose wrote:
>
> I'm actually OK with the more concise and obscure notation, but I think
> we need to note carefully where writability readability trades off against
> readability, so we can tilt the language toward readability.

Paste error! Delete the first of three occurrences of "readability", in order to improve... er, readability.

From john.r.rose at oracle.com Wed Jan 9 01:15:13 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 8 Jan 2019 17:15:13 -0800
Subject: Flow scoping
In-Reply-To: <575567993.572436.1546992446030.JavaMail.zimbra@u-pem.fr>
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> <575567993.572436.1546992446030.JavaMail.zimbra@u-pem.fr>
Message-ID: <2890E053-F648-43B1-BADC-D3D180F7BEE7@oracle.com>

I'm actually OK with the more concise and obscure notation, but I think we need to note carefully where writability readability trades off against readability, so we can tilt the language toward readability.

On Jan 8, 2019, at 4:07 PM, Remi Forax wrote:
>
> While the aim is perhaps noble, making the code more explicit, your proposal has the same kind of issues we had before 8 when the compiler was asking to declare the captured local variables final.
> - your proposal transforms the problem from "why does it compile?" to "why doesn't it compile?".
Yes, that is almost the definition of a "readability feature": It may be harder to write, but is easier to understand once written.

> - if there is no declaration of 's', the error will be reported on 's' at the last line and most of the users will not know what to do.

A straw man, easily fixed by an error message that points at the missed binding and says "did you mean?".

> - the IDEs may help, but you should be able to write a basic Java program without an IDE (in jshell, for example) and IDEs may declare 's' at the wrong place (the wrong scope) anyway.
> - Groovy and Kotlin already have this kind of feature and the branding "hey look the compiler is smart" seems enough.

Yep; people are already learning to read such twisty code, so that decreases the need for readability on this feature.

> - I'm not even sure it makes the code more readable; a lot of users will not understand the code because it looks like you can use 's' in the then branch ('s' is declared above after all).

There's nothing new here: The 's' won't be usable on the then-branch for the usual rules that pertain to definite assignment. I'm assuming that Java students learn early that you can't use a variable that hasn't been assigned yet.

-- John

From guy.steele at oracle.com Wed Jan 9 17:46:30 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 9 Jan 2019 12:46:30 -0500
Subject: Flow scoping
In-Reply-To: <2890E053-F648-43B1-BADC-D3D180F7BEE7@oracle.com>
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> <575567993.572436.1546992446030.JavaMail.zimbra@u-pem.fr> <2890E053-F648-43B1-BADC-D3D180F7BEE7@oracle.com>
Message-ID: <85DE48E5-5BB7-459A-9085-9AE2BC1183FF@oracle.com>

John's proposal is really interesting, and I've had to mull it over for several days, and in the end the best I can say is that I lean slightly against it, not because it's "wrong"
but because I think that on balance it will require explaining more mechanism.

But now I will reveal my own bias: I feel that we are debating the best way to make code clear _after_ a programmer has already chosen to trade off clarity for some other convenience. I agree that we need to support all sorts of code transformation equivalences, even on code that is structured in what I would regard as a suboptimal manner. I'm even happy to agree that there are differences of opinion about the relative desirability of minimizing code indentation and what concessions are acceptable for achieving that goal.

Still, I believe that if you really care about making the structure of the code clear, then you would be well advised to (a) avoid inverting the sense of boolean tests, and (b) avoid relying on the fact that one arm of a conditional has a control transfer so that you can "get away with" saving a level of horizontal indentation.

From brian.goetz at oracle.com Wed Jan 9 18:14:26 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 9 Jan 2019 13:14:26 -0500
Subject: Flow scoping
In-Reply-To: <85DE48E5-5BB7-459A-9085-9AE2BC1183FF@oracle.com>
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> <575567993.572436.1546992446030.JavaMail.zimbra@u-pem.fr> <2890E053-F648-43B1-BADC-D3D180F7BEE7@oracle.com> <85DE48E5-5BB7-459A-9085-9AE2BC1183FF@oracle.com>
Message-ID: <17f41d88-ee10-8107-d6e6-cea503e116f3@oracle.com>

> Still, I believe that if you really care about making the structure of
> the code clear, then you would be well advised to (a) avoid inverting
> the sense of boolean tests, and (b) avoid relying on the fact that one
> arm of a conditional has a control transfer so that you can "get away
> with" saving a level of horizontal indentation.

I think the clarity knife sometimes cuts in this direction, but sometimes in the other direction. If I have:
    if (x instanceof P(var y)) {
        // more than a page of code
    }
    else
        throw new FooException();

vs

    if (!(x instanceof P(var y)))
        throw new FooException();
    // the same page of code

In the latter case, I've checked all my preconditions up front, so it's more obviously fail-fast. Maintainers are less likely to forget the condition they just tested a page ago, and readers are more able to build a mental model of the invariants of the happy path for this method. So I think it's not always about "saving indentation"; in this case it's "get the precondition checks out of the way, and set me up to do the work without further interruption."

From guy.steele at oracle.com Wed Jan 9 18:13:34 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 9 Jan 2019 13:13:34 -0500
Subject: Flow scoping
In-Reply-To: <17f41d88-ee10-8107-d6e6-cea503e116f3@oracle.com>
References: <163A15B3-AFEB-4457-B444-B335CC90B16C@oracle.com> <597E222C-BCC1-43F2-8B95-3699C935751D@oracle.com> <575567993.572436.1546992446030.JavaMail.zimbra@u-pem.fr> <2890E053-F648-43B1-BADC-D3D180F7BEE7@oracle.com> <85DE48E5-5BB7-459A-9085-9AE2BC1183FF@oracle.com> <17f41d88-ee10-8107-d6e6-cea503e116f3@oracle.com>
Message-ID:

> On Jan 9, 2019, at 1:14 PM, Brian Goetz wrote:
>
>> Still, I believe that if you really care about making the structure of the code clear, then you would be well advised to (a) avoid inverting the sense of boolean tests, and (b) avoid relying on the fact that one arm of a conditional has a control transfer so that you can "get away with" saving a level of horizontal indentation.
>
> I think the clarity knife sometimes cuts in this direction, but sometimes in the other direction.
>
> If I have:
>
>     if (x instanceof P(var y)) {
>         // more than a page of code
>     }
>     else
>         throw new FooException();
>
> vs
>
>     if (!(x instanceof P(var y)))
>         throw new FooException();
>
>     // the same page of code
>
> In the latter case, i've checked all my preconditions up front, so it's more obviously fail-fast. Maintainers are less likely to forget the condition they just tested a page ago, and readers are more able to build a mental model of the invariants of the happy path for this method. So I think it's not always about "saving indentation"; in this case it's "get the precondition checks out of the way, and set me up to do the work without further interruption."

Sure -- and in such a situation I might still prefer the first form, _or_ I might well choose to write instead

    if (!(x instanceof P))
        throw new FooException();
    String y = ((P)x).yfield;
    // the same page of code

and forego the slight advantage of pattern matching (perhaps relying on the compiler's flow analysis to notice that the cast `(P)x` does not actually require a redundant run-time check), in order to make the scope of `y` crystal-clear.

There are stylistic tradeoffs here, and no one style is perfect. If one style gets too squirrelly, the programmer can choose to use another. Therefore we need not always go to extremes to salvage one specific style; that's a meta-tradeoff language designers can choose to make.

From brian.goetz at oracle.com Wed Jan 9 18:44:12 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 9 Jan 2019 13:44:12 -0500
Subject: Sealed types -- updated proposal
Message-ID: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com>

Here's an update on the sealed type proposal based on recent discussions.

*Definition.* A /sealed type/ is one for which subclassing is restricted according to guidance specified with the type's declaration; finality can be considered a degenerate form of sealing, where no subclasses at all are permitted.
Sealed types are a sensible means of modeling /algebraic sum types/ in a nominal type hierarchy; they go nicely with records (/algebraic product types/), though are also useful on their own.

Sealing serves two distinct purposes. The first, and more obvious, is that it restricts who can be a subtype. This is largely a declaration-site concern, where an API owner wants to defend the integrity of their API. The other is that it potentially enables exhaustiveness analysis at the use site when switching over sealed types (and possibly other features.) This is less obvious, and the benefit is contingent on some other things, but is valuable as it enables better compile-time type checking.

*Declaration.* We specify that a class is sealed by applying the |semi-final| modifier to a class, abstract class, or interface:

    semi-final interface Node { ... }

In this streamlined form, |Node| may be extended only by named classes declared in the same nest. This may be suitable for many situations, but not for all; in this case, the user may specify an explicit |permits| list:

    semi-final interface Node permits FooNode, BarNode { ... }

/Note: |permits| here is a contextual keyword./

The two forms may not be combined; if there is a permits list, it must list all the permitted subtypes. We can think of the simple form as merely inferring the |permits| clause from information in the nest.

*Exhaustiveness.* One of the benefits of sealing is that the compiler can enumerate the permitted subtypes of a sealed type; this in turn lets us perform exhaustiveness analysis when switching over patterns involving sealed types. Permitted subtypes must belong to the same module (or, if not in a module, the same package.)

/Note:/ It is superficially tempting to have a relaxed but less explicit form, say which allows for a type to be extended by package-mates or module-mates without listing them all. However, this would undermine the compiler's ability to reason about exhaustiveness.
This would achieve the desired subclassing restrictions, but not the desired ability to reason about exhaustiveness.

*Classfile.* In the classfile, a sealed type is identified with an |ACC_FINAL| modifier, and a |PermittedSubtypes| attribute which contains a list of permitted subtypes (similar in structure to the nestmate attributes.)

*Transitivity.* Sealing is transitive; unless otherwise specified, an abstract subtype of a sealed type is implicitly sealed (permits list to be inferred), and a concrete subtype of a sealed type is implicitly final. This can be reversed by explicitly modifying the subtype with the |non-final| modifier.

Unsealing a subtype in a hierarchy doesn't undermine the sealing, because the (possibly inferred) set of explicitly permitted subtypes still constitutes a total covering. However, users who know about unsealed subtypes can use this information to their benefit (much like we do with exceptions today; you can catch |FileNotFoundException| separately from |IOException| if you want, but don't have to.)

/Note:/ Scala made the opposite choice with respect to transitivity, requiring sealing to be opted into at all levels. This is widely believed to be a source of bugs; it is rare that one actually wants a subtype of a sealed type to not be sealed. I suspect the reasoning in Scala was, at least partially, the desire to not make up a new keyword for "not sealed". This is understandable, but I'd rather not add to the list of "things for which Java got the defaults wrong."
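To make the "total covering" argument concrete, here is a rough sketch. It is an assumption-laden illustration, not the proposal's syntax: it uses the `sealed` / `non-sealed` spelling found in later Java releases (17 and up) instead of the |semi-final| / |non-final| strawman, it spells out every subtype's modifier explicitly rather than relying on the implicit transitivity described above, and all type names are hypothetical.

```java
// Sketch only: sealed / non-sealed as spelled in Java 17+, not the
// semi-final / non-final strawman from this message.
// Shape, Circle, Square, Custom, Fancy and SealedSketch are hypothetical names.
sealed interface Shape permits Circle, Square, Custom { }

final class Circle implements Shape { }
final class Square implements Shape { }

// Explicitly unsealed: anyone may extend Custom, yet every Shape is still
// a Circle, a Square, or some kind of Custom, so the covering stays total.
non-sealed class Custom implements Shape { }

class Fancy extends Custom { }

class SealedSketch {
    // Three checks cover the hierarchy even though Custom is open to extension.
    static String classify(Shape s) {
        if (s instanceof Circle) return "circle";
        if (s instanceof Square) return "square";
        return "custom"; // Custom or any of its subclasses
    }

    public static void main(String[] args) {
        System.out.println(classify(new Fancy()));
        System.out.println(Shape.class.isSealed());
        System.out.println(Shape.class.getPermittedSubclasses().length);
    }
}
```

A client that knows about `Fancy` can still test for it separately, in the same way one can catch |FileNotFoundException| ahead of |IOException|.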
An example of where explicit unsealing (and private subtypes) is useful can be found in the JEP-334 API:

    semi-final interface ConstantDesc
        permits String, Integer, Float, Long, Double,
                ClassDesc, MethodTypeDesc, MethodHandleDesc,
                DynamicConstantDesc { }

    semi-final interface ClassDesc extends ConstantDesc
        permits PrimitiveClassDescImpl, ReferenceClassDescImpl { }

    private class PrimitiveClassDescImpl implements ClassDesc { }
    private class ReferenceClassDescImpl implements ClassDesc { }

    semi-final interface MethodTypeDesc extends ConstantDesc
        permits MethodTypeDescImpl { }

    semi-final interface MethodHandleDesc extends ConstantDesc
        permits DirectMethodHandleDesc, MethodHandleDescImpl { }

    semi-final interface DirectMethodHandleDesc extends MethodHandleDesc
        permits DirectMethodHandleDescImpl { }

    // designed for subclassing
    non-final class DynamicConstantDesc implements ConstantDesc { ... }

*Enforcement.* Both the compiler and JVM should enforce sealing.

*Accessibility.* Subtypes need not be as accessible as the sealed parent. In this case, not all clients will get the chance to exhaustively switch over them; they'll have to make these switches exhaustive with a |default| clause or other total pattern. When compiling a switch over such a sealed type, the compiler can provide a useful error message ("I know this is a sealed type, but I can't provide full exhaustiveness checking here because you can't see all the subtypes, so you still need a default.")

*Javadoc.* The list of permitted subtypes should probably be considered part of the spec, and incorporated into the Javadoc. Note that this is not exactly the same as the current "All implementing classes" list that Javadoc currently includes, so a list like "All permitted subtypes" might be added (possibly with some indication if the subtype is less accessible than the parent.)

*Auxiliary subtypes.* With the advent of records, which allow us to define classes in a single line, the "one class per file"
rule starts to seem both a little silly and to constrain the user's ability to put related definitions together (which may be more readable) while exporting a flat namespace in the public API.

One way to get there would be to relax the "no public auxiliary classes" rule for sealed classes, say: allowing public auxiliary subtypes of the primary type, if the primary type is public and sealed.

Another would be to borrow a trick from enums; for a sealed type with nested subtypes, when you |import| the sealed type, you implicitly import the nested subtypes too. That way you could declare:

    semi-final interface Node {
        class A implements Node { }
        class B implements Node { }
    }

...but clients could import Node and then refer to A and B directly:

    switch (node) {
        case A(): ...
        case B(): ...
    }

We do something similar for |enum| constants today.

From forax at univ-mlv.fr Thu Jan 10 07:21:13 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 10 Jan 2019 08:21:13 +0100 (CET)
Subject: Flow scoping
In-Reply-To:
References: <575567993.572436.1546992446030.JavaMail.zimbra@u-pem.fr> <2890E053-F648-43B1-BADC-D3D180F7BEE7@oracle.com> <85DE48E5-5BB7-459A-9085-9AE2BC1183FF@oracle.com> <17f41d88-ee10-8107-d6e6-cea503e116f3@oracle.com>
Message-ID: <989194811.812830.1547104873982.JavaMail.zimbra@u-pem.fr>

It's basically what Swift does: you have a syntactic form for

    if (x instanceof P(var y))

written

    if let y = (x as? P)?.y

but it cannot be inverted/negated (and you cannot extract more than one variable easily).

So yes, the question is where to draw the line. I'm with Brian on this: given that in Java if (!(x instanceof P)) is a frequent idiom, I think it's better to support that idiom instead of saying we don't support it. If we were developing a language from scratch, I would have agreed with you, Guy.
regards,
Rémi

From brian.goetz at oracle.com Fri Jan 11 12:47:56 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 11 Jan 2019 07:47:56 -0500
Subject: Fwd: Hyphenated keywords and switch expressions
References:
Message-ID: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com>

Received from the -comments list.

> Begin forwarded message:
>
> From: Ben Evans
> Subject: Hyphenated keywords and switch expressions
> Date: January 11, 2019 at 5:19:25 AM EST
> To: amber-spec-comments at openjdk.java.net
>
> Hi EG members,
>
> I had a couple of comments on hyphenated keywords.
>
> First off, I think they're a great idea, if used judiciously. One of
> Java's great strengths when teaching it to beginners is the simplicity
> and explicit regularity of the grammar. A few more keywords, keeping
> clear the distinction between keywords and other language constructs,
> is IMO a good thing - especially when it allows us to tidy up and
> explicitly express language ideas that we currently can't.
>
> For example, the lack of package-private as an explicit keyword is one
> of the most common sources of confusion and errors I see in new Java
> devs - especially those who already know other programming languages.
> If we are going to do hyphenated
>
> Secondly, a question. Is it worth reconsidering the choice of keyword
> used for switch expressions with hyphenation in mind? I re-read the
> discussion about not wanting to see switch statements abandoned, and a
> lack of consistency by using a new match keyword (imposed by eminent
> domain), and I can see the validity of a lot of the points made.
>
> However, with hyphenated keywords, we have other possibilities - so
> what about using something like switch-expr (or switch-expression)
> instead? With the rules:
>
> 1. 
switch is only legal in statement form (the current Java 11 behaviour)
> 2. switch-expr is only legal for the expression form
>
> To my mind, this helps in a few ways:
>
> a) It maintains the cognitive connection between switch expressions
> and switch statements, and doesn't lead to the feeling that switch
> statements are abandonware
> b) It provides a clear clue to newcomers about the distinction between
> the two forms (bear in mind that beginners often don't develop a full
> grasp of the distinction between statements and expressions until they
> have gained some proficiency with the language. The differentiation of
> the keyword could help provide visual clues and avoid
> hard-to-understand-if-you're-a-newbie compiler errors when debugging)
> c) There is no cognitive overhead for experienced programmers.
> d) IDEs will easily be able to autodetect and offer to correct if the
> wrong keyword is used "Did you mean switch-expr instead of switch?"
>
> I'd be really interested to hear what people think - the above is very
> much with my "teaching newbies" hat on, and I know that's far from the
> only concern here.
>
> Cheers,
>
> Ben

From amaembo at gmail.com Fri Jan 11 13:32:04 2019
From: amaembo at gmail.com (Tagir Valeev)
Date: Fri, 11 Jan 2019 20:32:04 +0700
Subject: Hyphenated keywords and switch expressions
In-Reply-To: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com>
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com>
Message-ID:

Hello!

To my personal taste, making a longer keyword for switch expressions is bad. Expressions should be more compact compared to statements, and a switch expression is already quite verbose (e.g. compared to ?: which is close to a two-case switch expression). So having a separate keyword like `switch-expr` would only make things worse.
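As a rough illustration of the verbosity comparison made above, the sketch below writes the same two-way choice with the conditional operator and with a two-case switch expression. It assumes a Java release where switch expressions are available (Java 14 and later) and keeps the plain `switch` keyword; the class and method names are hypothetical.

```java
// Sketch only: assumes switch expressions are available (Java 14+).
// SwitchVerbositySketch, viaTernary and viaSwitch are hypothetical names.
class SwitchVerbositySketch {

    // Conditional operator: one short expression.
    static String viaTernary(int n) {
        return n == 0 ? "zero" : "nonzero";
    }

    // Two-case switch expression computing the same value.
    static String viaSwitch(int n) {
        return switch (n) {
            case 0 -> "zero";
            default -> "nonzero";
        };
    }

    public static void main(String[] args) {
        System.out.println(viaTernary(0) + " " + viaSwitch(0));
        System.out.println(viaTernary(7) + " " + viaSwitch(7));
    }
}
```

A hyphenated keyword such as `switch-expr` would lengthen the second form further, which is the concern being raised.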
On the other hand, from an IDE developer's point of view, having an expression and a statement with such similar syntax definitely adds confusion to the parsing (and probably to users). E.g. suppose we want to parse a fragment which consists of a number of statements, isolated from other code:

    switch(0) { default -> throw new Exception(); };

In normal context it's two statements: a switch statement followed by an empty statement. However, inside a switch expression rule it's one statement: an expression statement containing a switch expression:

    int x = switch(0) {
      default -> switch(0) { default -> throw new Exception(); };
    };

Normally if we take a textual representation of a single statement, it can be parsed back into the same single statement when isolated from other code (the same works for expressions). Here this rule is violated: the expression statement taken from the switch expression rule could be reparsed in isolation as two statements. This poses a problem for us, as we not only parse complete programs but often perform code transformations during refactorings/quick-fixes, which involves juggling and reparsing code fragments. Surely we can do something with this, but this is ugly. Of course separate keywords would solve the problem.

Another slightly confusing thing is lambdas. E.g.:

    IntSupplier s = () -> switch(0) { default -> throw new Exception(); };  -- this code compiles
    Runnable s = () -> switch(0) { default -> throw new Exception(); };  -- this doesn't, need to wrap switch into braces

One may think that the former is a switch statement, but actually it's an expression. It would be clearer if we had different keywords.

So there are ambiguities and confusing cases. Still I think they are bearable and we can live with them. A more interesting question is whether we can change the `break` keyword in switch expressions. As it's rarely necessary, in my opinion it's perfectly ok to make it longer. Disambiguating label-break and value-break in IDE code is also somewhat ugly.
Again, as we may juggle statements, we cannot tell whether it's a label or a value without proper context, and sometimes the meaning could change inadvertently.

These are just some thoughts to add to the overall picture. In general we're still fine with the current spec.

With best regards,
Tagir Valeev.

From brian.goetz at oracle.com Fri Jan 11 15:58:51 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 11 Jan 2019 10:58:51 -0500
Subject: Fwd: Multiple return values
References:
Message-ID: <8D7102FF-4B48-49C3-9C3A-3E6738D39005@oracle.com>

Received on the comments list.

> Begin forwarded message:
>
> From: Lukas Eder
> Subject: Multiple return values
> Date: January 11, 2019 at 10:57:19 AM EST
> To: amber-spec-comments at openjdk.java.net
>
> Hello,
>
> I'm referring to the exciting proposed new features around destructuring
> values from records and other types as shown here:
> https://cr.openjdk.java.net/~briangoetz/amber/pattern-match.html#pattern-bind-statements
>
> The example given was:
>
> Rect r = ...
> __let Rect(var p0, var p1) = r;
> // use p0, p1
>
> This is a very useful construct, which I have liked using in other
> languages a lot. 
> Just today, I had a similar use case where I would have
> liked to be able to do something like this, but without declaring a nominal
> type Rect. Every now and then, I would like to return more than one value
> from a method. For example:
>
> private X, Y method() {
>   X x = ...
>   Y y = ...
>   return x, y;
> }
>
> I would then call this method as follows (hypothetical syntax. Many other
> syntaxes are possible, e.g. syntaxes that make expressions look like
> tuples, or actual tuples of course):
>
> X x, Y y = method();
>
> The rationale is that I don't (always) want to:
>
> - Modify either type X or Y, because this is just one little method where I
> want to indicate to the call site of method() in what context they should
> interpret X by providing a context Y
> - Wrap X and Y in a new type, because creating new types is too much work
> - Wrap X and Y in Object[] because that's just dirty
> - Rely on escape analysis for some wrapper type (minor requirement for me)
> - Assign both X and Y. Something like "X x, _ = method()" or "_, Y y =
> method()" would be useful, too.
>
> I was wondering if in the context of all the work going on in Amber around
> capturing local variables, etc. if something like this is reasonably
> possible as well in some future Java.
>
> This is possible in Python. Go uses this syntax to return exceptions.
>
> Thanks,
> Lukas

From brian.goetz at oracle.com Fri Jan 11 16:08:17 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 11 Jan 2019 11:08:17 -0500
Subject: Multiple return values
In-Reply-To:
References:
Message-ID:

While I understand where you're coming from, I think multiple return is likely to be both more intrusive and less satisfying than it first appears.

First, it's a relatively deep cut; it goes all the way down to method descriptors, since methods in the JVM can only return a single thing.
So what you're really asking the compiler to do is create an anonymous record (whose denotation must be stable as it will be burned into client classfiles.) That's the "more intrusive" part.

The "less satisfying" part is that if you can return multiple values:

    return (x, y)

and then obviously you need a way to destructure multiple values:

    (x, y) = method()

(since otherwise, what would you do with the return value?)

But here's where people will hate you: why can I use tuples as return values, and destructure them into locals, but not use them as method arguments, or type parameters? Now I can't compose

    someMethod(method())

because I can't denote the return type of method() as a parameter type. And I can use your multiple-returning method in a stream map:

    Stream<???> s = aStream.map(Lukas::method) // stream of what?

When we tug on this string, we'll be very disappointed that it's not tied to anything.

Instead, what you can do is expose records in your APIs:

```
class MyAPI {
    record Range(int lo, int hi) { ... }

    Range method() { ... }
}
```

and now a caller gets a Range back, which is a denotable type and whose components have descriptive names.

You say you don't want to do this because creating new types is so much work. Is the one-line declaration of `Range` above really so much work? (Ignoring the fact that returning a Range is far more descriptive than returning an (int, int) pair.)

From brian.goetz at oracle.com Fri Jan 11 19:16:10 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 11 Jan 2019 14:16:10 -0500
Subject: Fwd: Raw string literals: learning from Swift
References: <6761DB40-0EA5-43B1-9343-D4B99764FE4E@icloud.com>
Message-ID: <667934E4-2883-4E73-9B56-C27F2140D59C@oracle.com>

Received on the -comments list.
> Begin forwarded message:
>
> From: Fred Curts
> Subject: Raw string literals: learning from Swift
> Date: January 11, 2019 at 2:15:10 PM EST
> To: amber-spec-comments at openjdk.java.net
>
> With Swift 5 recently adding custom String delimiters (also called raw string literals), I find the design of Swift's string literals very compelling, more so than other languages I've studied.
>
> https://github.com/apple/swift-evolution/blob/master/proposals/0200-raw-string-escaping.md (implemented in Swift 5)
> https://github.com/apple/swift-evolution/blob/master/proposals/0168-multi-line-string-literals.md (implemented in Swift 4)
> https://docs.swift.org/swift-book/LanguageGuide/StringsAndCharacters.html#ID286
>
> Here is what I like about Swift's string literals. In no particular order:
>
> 1. Multi-line and raw string literals are orthogonal features.
> (Try adding a literal dollar sign to a Kotlin multi-line string literal and you'll know what I mean.)
>
> 2. Custom string delimiters solve all the use cases for raw string literals but nevertheless support escape sequences and interpolation expressions.
> I've personally come across this need many times when trying to build larger regular expressions or code snippets out of smaller ones.
>
> 3. Escape sequences and interpolation expressions use the same escape character.
> This simplifies matters considerably, in particular once custom string delimiters are added to the mix.
> (Having multiple custom escape characters would be too much.)
>
> 4. Multi-line string literals are delimited by triple double quotes.
> This makes them visually compatible with but heavier than single-line string literals, which seems like a good fit.
> Distinct delimiters for single-line and multi-line string literals seem like a win for both humans and parsers.
> For example, it's easy to tell where the missing end quote of a single-line string literal belongs.
>
> 5. It's easy to control line indentation of multi-line string literals and leading and trailing whitespace of the entire string.
> All of this is settled at compile time.
>
> 6. Opening and closing delimiters of multi-line string literals must be on their own line.
> This avoids headaches with edge cases such as string literals ending with two double quotes.
>
> 7. Multi-line string literals with custom string delimiters can contain arbitrarily long sequences of double quotes.
>
> I hope I've convinced you that the design of Swift's string literals is worth a closer look.
>
> -Fred

From guy.steele at oracle.com Fri Jan 11 20:23:11 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 11 Jan 2019 15:23:11 -0500
Subject: Raw string literals: learning from Swift
In-Reply-To: <667934E4-2883-4E73-9B56-C27F2140D59C@oracle.com>
References: <6761DB40-0EA5-43B1-9343-D4B99764FE4E@icloud.com> <667934E4-2883-4E73-9B56-C27F2140D59C@oracle.com>
Message-ID: <6110975F-0D8E-4537-9C25-CE853CBFFBA9@oracle.com>

I like it.

There is an advantage to using a visually heavyweight character like '#'. If you don't want to use that, I think '$' would work. (I considered '%', but there are two problems: in familiar usage it occurs a lot in format strings (even more than '$'), and moreover in principle the string %" can already occur in a legitimate Java program (consider `myVar%"foobaz".substring(k).length()`), but I think $" cannot.)

I like that it leaves open a variety of escape constructions for possible future use.

From alex.buckley at oracle.com Sat Jan 12 00:42:47 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Fri, 11 Jan 2019 16:42:47 -0800
Subject: Hyphenated keywords and switch expressions
In-Reply-To:
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com>
Message-ID: <5C393807.8000500@oracle.com>

Hi Tagir,

On 1/11/2019 5:32 AM, Tagir Valeev wrote:
> On the other hand, from IDE developer point of view, having expression
> and statement with so similar syntax definitely adds a confusion to
> the parsing (and probably to users). E.g. suppose we want to parse a
> fragment which consists of a number of statements, isolated from other
> code:
>
> switch(0) { default -> throw new Exception(); };
>
> In normal context it's two statements: switch-statement followed by an
> empty statement. However inside switch expression rule it's one
> statement: an expression statement containing a switch expression:
>
> int x = switch(0) { default -> switch(0) { default -> throw new
> Exception(); }; };
>
> Normally if we take a textual representation of single statement, it
> could be parsed back into the same single statement, when isolated
> from other code (the same works for expressions).
> Here this rule is
> violated: the expression statement taken from switch expression rule
> could be reparsed in isolation as two statements.

I'm concerned about any claim of ambiguity in the grammar, though I'm not sure I'm following you correctly. I agree that your first fragment is parsed as two statements -- a switch statement and an empty statement -- but I don't know what you mean about "inside switch expression rule" for your second fragment. A switch expression is not an expression statement (JLS 14.8). In your second fragment, the leftmost default label is followed not by a block or a throw statement but by an expression (`switch (0) {...}`, a unary expression) and a semicolon.

Yes, the phrase `switch (0) {...}` is parsed as a switch statement in one context and as a unary expression in another context. Is that the ambiguity you wished to highlight?

Alex

From amaembo at gmail.com Sun Jan 13 10:53:35 2019
From: amaembo at gmail.com (Tagir Valeev)
Date: Sun, 13 Jan 2019 17:53:35 +0700
Subject: Hyphenated keywords and switch expressions
In-Reply-To: <5C393807.8000500@oracle.com>
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com> <5C393807.8000500@oracle.com>
Message-ID:

Hello!

> I'm concerned about any claim of ambiguity in the grammar, though I'm
> not sure I'm following you correctly. I agree that your first fragment
> is parsed as two statements -- a switch statement and an empty statement
> -- but I don't know what you mean about "inside switch expression rule"
> for your second fragment. A switch expression is not an expression
> statement (JLS 14.8). In your second fragment, the leftmost default
> label is followed not by a block or a throw statement but by an
> expression (`switch (0) {...}`, a unary expression) and a semicolon.

Ah, ok, we moved away slightly from the spec draft [1]. I was not aware, because I haven't written the parser myself.
The draft says:

SwitchLabeledRule:
  SwitchLabeledExpression
  SwitchLabeledBlock
  SwitchLabeledThrowStatement

SwitchLabeledExpression:
  SwitchLabel -> Expression ;
SwitchLabeledBlock:
  SwitchLabel -> Block ;
SwitchLabeledThrowStatement:
  SwitchLabel -> ThrowStatement ;

(by the way I think that ; after block and throw should not be present: current implementation does not require it after the block and throw statement already includes a ; inside it).

Instead we implement it like:

SwitchLabeledRule:
  SwitchLabel -> SwitchLabeledRuleStatement
SwitchLabeledRuleStatement:
  ExpressionStatement
  Block
  ThrowStatement

So we assume that the right part of SwitchLabeledRule is always a statement and reused ExpressionStatement to express Expression plus semicolon, because syntactically it looks the same. Strictly following a spec draft here looks even more ugly, because it requires more object types in our code model and reduces the flexibility when we need to perform code transformation. E.g. if we want to wrap expression into block, currently we just need to replace an ExpressionStatement with a Block not touching a SwitchLabel at all. Had we mirrored the spec in our code model, we would need to replace SwitchLabeledExpression with SwitchLabeledBlock which looks more annoying.

With best regards,
Tagir Valeev

[1] http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.11

From forax at univ-mlv.fr Mon Jan 14 11:20:43 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 14 Jan 2019 12:20:43 +0100 (CET)
Subject: Multiple return values
In-Reply-To:
References:
Message-ID: <1690343428.117805.1547464843466.JavaMail.zimbra@u-pem.fr>

You can have both!
This is basically what we are doing with lambdas, you have a structural syntax + a named type that are bound together using inference.

Let's say we have a tuple keyword that means value + record + constructor/de-constructor:

  tuple Range(int lo, int hi) { ... }

then you can write:

  Range method(int x) {
    return (x, x + 1); // the compiler infers "new Range(x, x + 1)"
  }

and also

  var (x, y) = method(); // the compiler uses the de-constructor or the record getters if there is no de-constructor

With Stream<???> s = aStream.map(Lukas::method), ??? is a Range, and if someone wants to use the tuple syntax inside a call to map(), a type has to be provided, for example

  aStream.map(x -> (x, x + 1)).collect(...)

Rémi

From brian.goetz at oracle.com Mon Jan 14 16:29:54 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 14 Jan 2019 11:29:54 -0500
Subject: Multiple return values
In-Reply-To: <1690343428.117805.1547464843466.JavaMail.zimbra@u-pem.fr>
References: <1690343428.117805.1547464843466.JavaMail.zimbra@u-pem.fr>
Message-ID:

I was trying to keep my reply focused on concepts (nominal product types rather than tuples), rather than ad-hoc syntactic tricks (which have the risk of blurring the distinction.)

You're basically proposing several things:

 - Define a distinguished sub-category of records that have extra features;
 - For that sub-category, provide an ad-hoc, tuple-like construction syntax;
 - For that sub-category, provide an ad-hoc, tuple-like destructuring syntax.

I think the first is a mistake, and I don't think we really need either of the latter two -- and I think having them may well be more confusing. If you're returning a Range, then saying

    x -> new Range(x,x+10)

is saying what you mean, and it's not very painful.

Similarly, there's no need to have an ad-hoc destructuring syntax for records; we _already_ have that, which is pattern matching. Whatever the syntax of "unconditional bind" is going to be, you'd be able to say something like

    __unconditional_bind Range(var lo, var hi) = getRange(...)

with an ordinary destructuring pattern. No need for an explicit pseudo-tuple concept, and no need for an ad-hoc tuple-like destructuring mechanism.

If, at the end of the game, we decide that the straight denotation of pattern matching isn't enough, we can revisit.
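To make the nominal alternative concrete, here is a minimal sketch of returning two values through a record. It assumes a JDK where records are available (Java 16 or later); the names `MyApi`, `rangeFrom`, and `width` are invented for illustration and do not come from the thread. Destructuring is shown with plain accessor calls, since the syntax of "unconditional bind" was still undecided in this discussion.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical API: MyApi, rangeFrom and width are illustrative names only.
final class MyApi {
    // A nominal product type: one line, denotable, with named components.
    record Range(int lo, int hi) { }

    // "Returns two values" by returning a single Range.
    static Range rangeFrom(int x) {
        return new Range(x, x + 10);
    }

    // Because Range is denotable, it can be used as a parameter type.
    static int width(Range r) {
        return r.hi() - r.lo();
    }

    public static void main(String[] args) {
        // Calls compose directly: the result of one method feeds the next.
        System.out.println(width(rangeFrom(5)));

        // The element type of the stream can be written down.
        List<Range> ranges = Stream.of(1, 2, 3)
                .map(MyApi::rangeFrom)
                .collect(Collectors.toList());
        System.out.println(ranges);

        // Taking the value apart is just two accessor calls here.
        Range r = rangeFrom(0);
        int lo = r.lo();
        int hi = r.hi();
        System.out.println(lo + ".." + hi);
    }
}
```

The sketch addresses the two composition problems raised earlier in the thread: `width(rangeFrom(5))` works because the return type has a name, and the stream has an expressible element type, `List<Range>`.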
But the answer to Lukas' question is basically:

 - Multiple return is a weak feature, that will feel like tuples dangled in front of your face, and then snatched back;
 - We prefer nominal tuples (records) to structural ones;
 - Records will have a compact (usually one line) declaration, a compact construction syntax, and a compact destructuring syntax.

From alex.buckley at oracle.com Mon Jan 14 20:09:00 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Mon, 14 Jan 2019 12:09:00 -0800
Subject: Hyphenated keywords and switch expressions
In-Reply-To:
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com> <5C393807.8000500@oracle.com>
Message-ID: <5C3CEC5C.9090801@oracle.com>

Hi Tagir,

On 1/13/2019 2:53 AM, Tagir Valeev wrote:
> Ah, ok, we moved away slightly from the spec draft [1]. I was not
> aware, because I haven't written the parser myself.
> The draft says:
>
> SwitchLabeledRule:
>   SwitchLabeledExpression
>   SwitchLabeledBlock
>   SwitchLabeledThrowStatement
>
> SwitchLabeledExpression:
>   SwitchLabel -> Expression ;
> SwitchLabeledBlock:
>   SwitchLabel -> Block ;
> SwitchLabeledThrowStatement:
>   SwitchLabel -> ThrowStatement ;
>
> Instead we implement it like:
>
> SwitchLabeledRule:
>   SwitchLabel -> SwitchLabeledRuleStatement
> SwitchLabeledRuleStatement:
>   ExpressionStatement
>   Block
>   ThrowStatement
>
> So we assume that the right part of SwitchLabeledRule is always a
> statement and reused ExpressionStatement to express Expression plus
> semicolon, because syntactically it looks the same.

That's an odd assumption, because SwitchLabeledRule appears in the SwitchBlock of a SwitchExpression, and we obviously intend any kind of expression to be allowed after the -> in a switch expression. That is, in a switch expression, what comes after the -> is not just an expression statement (x=y, ++x, --x, x++, x--, x.m(), new X()) but any expression (including this, X.class, x.f, x[i], x::m). Only in a switch statement do we restrict the kind of expression allowed after the -> but that's a semantic rule (14.11.2), not syntactic, in order to share the grammar between switch expressions and switch statements.

> Strictly following a spec draft here looks even more ugly, because it
> requires more object types in our code model and reduces the
> flexibility when we need to perform code transformation. E.g. if we
> want to wrap expression into block, currently we just need to replace
> an ExpressionStatement with a Block not touching a SwitchLabel at
> all. Had we mirrored the spec in our code model, we would need to
> replace SwitchLabeledExpression with SwitchLabeledBlock which looks
> more annoying.

Understood. The grammar is specified like it is in order to introduce the critical terms "switch labeled expression", "switch labeled block", and "switch labeled throw statement". Aligning production names with critical terms is longstanding JLS style.

Alex

From alex.buckley at oracle.com Mon Jan 14 20:40:11 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Mon, 14 Jan 2019 12:40:11 -0800
Subject: Hyphenated keywords and switch expressions
In-Reply-To:
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com> <5C393807.8000500@oracle.com>
Message-ID: <5C3CF3AB.5030602@oracle.com>

Hi Gavin,

Some points driven partly by the discussion with Tagir:

1. In 14.11.1, SwitchLabeledBlock should not end with a `;` -- there is no indication in JEP 325 that a semicolon is desired after `-> {...}` and javac in JDK 12 does not accept one there. Also, SwitchLabeledThrowStatement should not end with a `;` because ThrowStatement includes a `;`.

2. In 14.11.1, "This block can either be empty, or take one of two forms:" is wrong for switch expressions. The emptiness allowed by the grammar will be banned semantically in 15.29.1, so 14.11.1 should avoid trouble by speaking broadly of the forms in an educational tone:

"A switch block can consist of either:
- _Switch labeled rules_, which use `->` to introduce either a _switch labeled expression_, ..."

Also, "optionally followed by switch labels." is wrong for switch expressions, so prefer:

"- _Switch labeled statement groups_, which use `:` to introduce block statements."

3. In 15.29.1: (this is mainly driven by eyeballing against 14.11.2)

- Incorrect Markdown in section header.

- The error clause in the following bullet is redundant because the list header already called for an error: "The switch block must be compatible with the type of the selector expression, *****or a compile-time error occurs*****."

- I would prefer to pull the choice of {default label, enum typed selector expression} into a fourth bullet of the prior list, to align with how 14.11.2's list has a bullet concerning default label.
- The significant rule from 14.11.2 that "If the switch block consists of switch labeled rules, then any switch labeled expression must be a statement expression (14.8)." has no parallel in 15.29.1. Instead, for switch labeled rules, 15.29.1 has a rule for switch labeled blocks. (1) We haven't seen switch labeled blocks for ages, so a cross-ref to 14.11.1 is due. (2) A note that switch exprs allow `-> ANY_EXPRESSION` while switch statements allow `-> NOT_ANY_EXPRESSION` is due in both sections; grep ch.8 for "In this respect" to see what I mean. (3) The semantic constraints on switch labeled rules+statement groups in 15.29.1 should be easily contrastable with those in 14.11.2 -- one approach is to pull the following constraints into 15.29.1's "all conditions true, or error" list: ----- - If the switch block consists of switch labeled rules, then any switch labeled block (14.11.1) MUST COMPLETE ABRUPTLY. - If the switch block consists of switch labeled statement groups, then the last statement in the switch block MUST COMPLETE ABRUPTLY, and the switch block MUST NOT HAVE ANY SWITCH LABELS AFTER THE LAST SWITCH LABELED STATEMENT GROUP. ----- If you prefer to keep these semantic constraints standalone so that they have negative polarity, then 14.11.2 should to the same for its significant-but-easily-missed "must be a statement expression" constraint. Alex On 1/13/2019 2:53 AM, Tagir Valeev wrote: > Hello! > >> I'm concerned about any claim of ambiguity in the grammar, though I'm >> not sure I'm following you correctly. I agree that your first fragment >> is parsed as two statements -- a switch statement and an empty statement >> -- but I don't know what you mean about "inside switch expression rule" >> for your second fragment. A switch expression is not an expression >> statement (JLS 14.8). 
In your second fragment, the leftmost default >> label is followed not by a block or a throw statement but by an >> expression (`switch (0) {...}`, a unary expression) and a semicolon. > > Ah, ok, we moved away slightly from the spec draft [1]. I was not > aware, because I haven't wrote parser by myself. The draft says: > > SwitchLabeledRule: > SwitchLabeledExpression > SwitchLabeledBlock > SwitchLabeledThrowStatement > > SwitchLabeledExpression: > SwitchLabel -> Expression ; > SwitchLabeledBlock: > SwitchLabel -> Block ; > SwitchLabeledThrowStatement: > SwitchLabel -> ThrowStatement ; > > (by the way I think that ; after block and throw should not be > present: current implementation does not require it after the block > and throw statement already includes a ; inside it). > > Instead we implement it like: > > SwitchLabeledRule: > SwitchLabel -> SwitchLabeledRuleStatement > SwitchLabeledRuleStatement: > ExpressionStatement > Block > ThrowStatement > > So we assume that the right part of SwitchLabeledRule is always a > statement and reused ExpressionStatement to express Expression plus > semicolon, because syntactically it looks the same. Strictly following > a spec draft here looks even more ugly, because it requires more > object types in our code model and reduces the flexibility when we > need to perform code transformation. E.g. if we want to wrap > expression into block, currently we just need to replace an > ExpressionStatement with a Block not touching a SwitchLabel at all. > Had we mirrored the spec in our code model, we would need to replace > SwitchLabeledExpression with SwitchLabeledBlock which looks more > annoying. 
>
> With best regards,
> Tagir Valeev
>
> [1] http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.11
>

From forax at univ-mlv.fr Mon Jan 14 20:49:19 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 14 Jan 2019 21:49:19 +0100 (CET)
Subject: Multiple return values
In-Reply-To:
References: <1690343428.117805.1547464843466.JavaMail.zimbra@u-pem.fr>
Message-ID: <1398192988.258647.1547498959459.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Lukas Eder", "amber-spec-experts"
> Sent: Monday, January 14, 2019 17:29:54
> Subject: Re: Multiple return values

> I was trying to keep my reply focused on concepts (nominal product types
> rather than tuples), rather than ad-hoc syntactic tricks (which have the
> risk of blurring the distinction.)

That's the point of my previous mail: blur the distinction between a tuple and a non-mutable record, a tuple being just syntactic sugar on top of records and pattern matching.

>
> You're basically proposing several things:
>  - Define a distinguished sub-category of records that have extra features;
>  - For that sub-category, provide an ad-hoc, tuple-like construction
> syntax;
>  - For that sub-category, provide an ad-hoc, tuple-like destructuring
> syntax.
>
> I think the first is a mistake, and I don't think we really need either
> of the latter two -- and I think having them may well be more
> confusing.  If you're returning a Range, then saying
>
>     x -> new Range(x,x+10)
>
> is saying what you mean, and it's not very painful.

The first item is a mistake (re-reading it). I still think that ensuring that a tuple is non-mutable is important, but being a record and a value type is enough for that.

>
> Similarly, there's no need to have an ad-hoc destructuring syntax for
> records; we _already_ have that, which is pattern matching. Whatever the
> syntax of "unconditional bind" is going to be, you'd be able to say
> something like
>
>     __unconditional_bind Range(var lo, var hi) = getRange(...)
>
> with an ordinary destructuring pattern.  No need for an explicit
> pseudo-tuple concept, and no need for an ad-hoc tuple-like destructuring
> mechanism.

Yes, I fully agree, it's just syntactic sugar on top of records and pattern matching.
That's what makes the proposal great and stupid at the same time.
It's great because it's just syntactic sugar and it's stupid because it's just syntactic sugar, and like any syntactic sugar it can make the code less readable because it carries implicit semantics.

>
> If, at the end of the game, we decide that the straight denotation of
> pattern matching isn't enough, we can revisit.

Yes!

> If, at the end of the game, we decide that the straight denotation of
> pattern matching isn't enough, we can revisit. But the answer to Lukas'
> question is basically:
>
>  - Multiple return is a weak feature, that will feel like tuples
> dangled in front of your face, and then snatched back;
>  - We prefer nominal tuple (records) to structural ones;
>  - Records will have a compact (usually one line) declaration, a
> compact construction syntax, and a compact destructuring syntax.

Yes, I've hijacked this thread a little. I fully agree with your points 1 and 3.
I'm not sure about your point 2: yes, the Java type system is nominal, but sometimes a structural syntax helps readability; lambdas are a good example.

Anyway, as you said, we can revisit that later.

Rémi

>
>
>
> On 1/14/2019 6:20 AM, Remi Forax wrote:
>> You can have both!
>> This is basically what we are doing with lambdas, you have a structural syntax +
>> a named type that are bound together using inference.
>>
>> Let's say we have a tuple keyword that means value + record +
>> constructor/de-constructor
>> tuple Range(int lo, int hi) { ... }
>>
>> then you can write:
>> Range method(int x) {
>>   return (x, x + 1); // the compiler infers "new Range(x, x + 1)"
>> }
>>
>> and also
>> var (x, y) = method(); // the compiler uses the de-constructor or the record
>> getters if there is no de-constructor
>>
>>
>> With Stream<???> s = aStream.map(Lukas::method), ??? is a Range,
>> and if someone wants to use the tuple syntax inside a call to map(), a type has to
>> be provided,
>> for example
>> aStream.map(x -> (x, x + 1)).collect(...)
>>
>> Rémi
>>
>> ----- Original Message -----
>>> From: "Brian Goetz"
>>> To: "Lukas Eder"
>>> Cc: "amber-spec-comments"
>>> Sent: Friday, January 11, 2019 17:07:43
>>> Subject: Re: Multiple return values
>>> While I understand where you're coming from, I think multiple return is likely
>>> to be both more intrusive and less satisfying than it first appears.
>>>
>>> First, it's a relatively deep cut; it goes all the way down to method
>>> descriptors, since methods in the JVM can only return a single thing. So what
>>> you're really asking the compiler to do is create an anonymous record (whose
>>> denotation must be stable as it will be burned into client classfiles.) That's
>>> the "more intrusive" part.
>>>
>>> The "less satisfying" part is that if you can return multiple values:
>>>
>>> return (x, y)
>>>
>>> and then obviously you need a way to destructure multiple values:
>>>
>>> (x, y) = method()
>>>
>>> (since otherwise, what would you do with the return value?)
>>>
>>> But here's where people will hate you: why can I use tuples as return values,
>>> and destructure them into locals, but not use them as method arguments, or type
>>> parameters? Now I can't compose
>>>
>>> someMethod(method())
>>>
>>> because I can't denote the return type of method() as a parameter type. And I
>>> can't use your multiple-returning method in a stream map:
>>>
>>> Stream<???> s = aStream.map(Lukas::method) // stream of what?
>>>
>>> When we tug on this string, we'll be very disappointed that it's not tied to
>>> anything.
>>>
>>>
>>> Instead, what you can do is expose records in your APIs:
>>>
>>> ```
>>> class MyAPI {
>>>     record Range(int lo, int hi) { ... }
>>>
>>>     Range method() { ... }
>>> }
>>> ```
>>>
>>> and now a caller gets a Range back, which is a denotable type and whose
>>> components have descriptive names.
>>>
>>> You say you don't want to do this because creating new types is so much work.
>>> Is the one-line declaration of `Range` above really so much work? (Ignoring
>>> the fact that returning a Range is far more descriptive than returning an (int,
>>> int) pair.)
>>>
>>>
>>>
>>>> On Jan 11, 2019, at 10:57 AM, Lukas Eder wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'm referring to the exciting proposed new features around destructuring
>>>> values from records and other types as shown here:
>>>> https://cr.openjdk.java.net/~briangoetz/amber/pattern-match.html#pattern-bind-statements
>>>>
>>>> The example given was:
>>>>
>>>> Rect r = ...
>>>> __let Rect(var p0, var p1) = r;
>>>> // use p0, p1
>>>>
>>>> This is a very useful construct, which I have liked using in other
>>>> languages a lot. Just today, I had a similar use case where I would have
>>>> liked to be able to do something like this, but without declaring a nominal
>>>> type Rect. Every now and then, I would like to return more than one value
>>>> from a method. For example:
>>>>
>>>> private X, Y method() {
>>>>   X x = ...
>>>>   Y y = ...
>>>>   return x, y;
>>>> }
>>>>
>>>> I would then call this method as follows (hypothetical syntax. Many other
>>>> syntaxes are possible, e.g. syntaxes that make expressions look like
>>>> tuples, or actual tuples of course):
>>>>
>>>> X x, Y y = method();
>>>>
>>>> The rationale is that I don't (always) want to:
>>>>
>>>> - Modify either type X or Y, because this is just one little method where I
>>>> want to indicate to the call site of method() in what context they should
>>>> interpret X by providing a context Y
>>>> - Wrap X and Y in a new type, because creating new types is too much work
>>>> - Wrap X and Y in Object[] because that's just dirty
>>>> - Rely on escape analysis for some wrapper type (minor requirement for me)
>>>> - Assign both X and Y. Something like "X x, _ = method()" or "_, Y y =
>>>> method()" would be useful, too.
>>>>
>>>> I was wondering if in the context of all the work going on in Amber around
>>>> capturing local variables, etc. if something like this is reasonably
>>>> possible as well in some future Java.
>>>>
>>>> This is possible in Python. Go uses this syntax to return exceptions.
>>>>
>>>> Thanks,
>>>> Lukas

From john.r.rose at oracle.com Tue Jan 15 01:58:46 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 14 Jan 2019 17:58:46 -0800
Subject: Multiple return values
In-Reply-To: <1398192988.258647.1547498959459.JavaMail.zimbra@u-pem.fr>
References: <1690343428.117805.1547464843466.JavaMail.zimbra@u-pem.fr> <1398192988.258647.1547498959459.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Jan 14, 2019, at 12:49 PM, forax at univ-mlv.fr wrote:
>
> Yes, I fully agree, it's just syntactic sugar on top of records and pattern matching.
> That's what makes the proposal great and stupid at the same time.
> It's great because it's just syntactic sugar and it's stupid because it's just syntactic sugar, and like any syntactic sugar it can make the code less readable because it carries implicit semantics.
>
>>
>> If, at the end of the game, we decide that the straight denotation of
>> pattern matching isn't enough, we can revisit.
>
> Yes!
+1 revisit much later

I think this is a kind of target-based type inference. It is as if a tuple-like expression (x, y) were a poly expression, whose type depends on the target type. After inference, a pattern or constructor head is supplied, as T(x, y).

That's not completely alien to our bag of tricks. But it is way down the road. We can't even evaluate it until we have the regular T(x, y) form of patterns deployed, so we can gather experience using them *without* the extra type inference.

-- John

From amaembo at gmail.com Tue Jan 15 02:40:28 2019
From: amaembo at gmail.com (Tagir Valeev)
Date: Tue, 15 Jan 2019 09:40:28 +0700
Subject: Hyphenated keywords and switch expressions
In-Reply-To: <5C3CEC5C.9090801@oracle.com>
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com> <5C393807.8000500@oracle.com> <5C3CEC5C.9090801@oracle.com>
Message-ID:

Hello!

On Tue, Jan 15, 2019 at 3:09 AM Alex Buckley wrote:
> That's an odd assumption, because SwitchLabeledRule appears in the
> SwitchBlock of a SwitchExpression, and we obviously intend any kind of
> expression to be allowed after the -> in a switch expression. That is,
> in a switch expression, what comes after the -> is not just an
> expression statement (x=y, ++x, --x, x++, x--, x.m(), new X()) but any
> expression (including this, X.class, x.f, x[i], x::m). Only in a switch
> statement do we restrict the kind of expression allowed after the -> but
> that's a semantic rule (14.11.2), not syntactic, in order to share the
> grammar between switch expressions and switch statements.

My bad, I should have been more explicit here. We are an IDE, so dealing with incomplete/erroneous code is our first priority. In our grammar an expression statement is any expression plus semicolon. This allows us to build a well-formed AST from something like "a+b()%c;".
Of course, the error highlighter marks this as invalid code when it visits such a node, but other features work nicely. E.g. you can introduce a variable here, inline the b() call, or get a division-by-zero warning if c happens to always be zero at this point. This also extended nicely to enhanced switches: for a switch statement's SwitchLabeledExpression we just reuse the same error highlighting code as for expression statements, because the semantic rule is the same. The only ugliness is a switch expression inside SwitchLabeledExpression, which I mentioned in the first message.

With best regards,
Tagir Valeev.

From gavin.bierman at oracle.com Thu Jan 17 09:14:00 2019
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Thu, 17 Jan 2019 10:14:00 +0100
Subject: Hyphenated keywords and switch expressions
In-Reply-To: <5C3CF3AB.5030602@oracle.com>
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com> <5C393807.8000500@oracle.com> <5C3CF3AB.5030602@oracle.com>
Message-ID: <7A903926-3E7C-498A-9D57-E5AA0462D568@oracle.com>

Thank you Alex and Tagir. I have uploaded a new version of the spec at:

http://cr.openjdk.java.net/~gbierman/switch-expressions.html

This contains all the changes you suggested below. In addition, there is a small bug fix in 5.6.3 concerning widening (https://bugs.openjdk.java.net/browse/JDK-8213180).

I have also taken the opportunity to reorder chapter 15 slightly, so switch expressions are now section 15.28 and constant expressions are now section 15.29 (the last section in the chapter).

Comments welcome!
Gavin

> On 14 Jan 2019, at 21:40, Alex Buckley wrote:
>
> Hi Gavin,
>
> Some points driven partly by the discussion with Tagir:
>
> 1. In 14.11.1, SwitchLabeledBlock should not end with a `;` -- there is no indication in JEP 325 that a semicolon is desired after `-> {...}` and javac in JDK 12 does not accept one there. Also, SwitchLabeledThrowStatement should not end with a `;` because ThrowStatement includes a `;`.
>
> 2. In 14.11.1, "This block can either be empty, or take one of two forms:" is wrong for switch expressions. The emptiness allowed by the grammar will be banned semantically in 15.29.1, so 14.11.1 should avoid trouble by speaking broadly of the forms in an educational tone: "A switch block can consist of either: - _Switch labeled rules_, which use `->` to introduce either a _switch labeled expression_, ..." Also, "optionally followed by switch labels." is wrong for switch expressions, so prefer: "- _Switch labeled statement groups_, which use `:` to introduce block statements."
>
> 3. In 15.29.1: (this is mainly driven by eyeballing against 14.11.2)
>
> - Incorrect Markdown in section header.
>
> - The error clause in the following bullet is redundant because the list header already called for an error: "The switch block must be compatible with the type of the selector expression, *****or a compile-time error occurs*****."
>
> - I would prefer to pull the choice of {default label, enum typed selector expression} into a fourth bullet of the prior list, to align with how 14.11.2's list has a bullet concerning default label.
>
> - The significant rule from 14.11.2 that "If the switch block consists of switch labeled rules, then any switch labeled expression must be a statement expression (14.8)." has no parallel in 15.29.1. Instead, for switch labeled rules, 15.29.1 has a rule for switch labeled blocks. (1) We haven't seen switch labeled blocks for ages, so a cross-ref to 14.11.1 is due. (2) A note that switch exprs allow `-> ANY_EXPRESSION` while switch statements allow `-> NOT_ANY_EXPRESSION` is due in both sections; grep ch.8 for "In this respect" to see what I mean. (3) The semantic constraints on switch labeled rules+statement groups in 15.29.1 should be easily contrastable with those in 14.11.2 -- one approach is to pull the following constraints into 15.29.1's "all conditions true, or error" list:
>
> -----
> - If the switch block consists of switch labeled rules, then any switch labeled block (14.11.1) MUST COMPLETE ABRUPTLY.
> - If the switch block consists of switch labeled statement groups, then the last statement in the switch block MUST COMPLETE ABRUPTLY, and the switch block MUST NOT HAVE ANY SWITCH LABELS AFTER THE LAST SWITCH LABELED STATEMENT GROUP.
> -----
>
> If you prefer to keep these semantic constraints standalone so that they have negative polarity, then 14.11.2 should do the same for its significant-but-easily-missed "must be a statement expression" constraint.
>
> Alex
>
> On 1/13/2019 2:53 AM, Tagir Valeev wrote:
>> Hello!
>>
>>> I'm concerned about any claim of ambiguity in the grammar, though I'm
>>> not sure I'm following you correctly. I agree that your first fragment
>>> is parsed as two statements -- a switch statement and an empty statement
>>> -- but I don't know what you mean about "inside switch expression rule"
>>> for your second fragment. A switch expression is not an expression
>>> statement (JLS 14.8). In your second fragment, the leftmost default
>>> label is followed not by a block or a throw statement but by an
>>> expression (`switch (0) {...}`, a unary expression) and a semicolon.
>>
>> Ah, ok, we moved away slightly from the spec draft [1]. I was not
>> aware, because I haven't written the parser myself. The draft says:
>>
>> SwitchLabeledRule:
>>   SwitchLabeledExpression
>>   SwitchLabeledBlock
>>   SwitchLabeledThrowStatement
>>
>> SwitchLabeledExpression:
>>   SwitchLabel -> Expression ;
>> SwitchLabeledBlock:
>>   SwitchLabel -> Block ;
>> SwitchLabeledThrowStatement:
>>   SwitchLabel -> ThrowStatement ;
>>
>> (by the way I think that ; after block and throw should not be
>> present: current implementation does not require it after the block
>> and throw statement already includes a ; inside it).
>>
>> Instead we implement it like:
>>
>> SwitchLabeledRule:
>>   SwitchLabel -> SwitchLabeledRuleStatement
>> SwitchLabeledRuleStatement:
>>   ExpressionStatement
>>   Block
>>   ThrowStatement
>>
>> So we assume that the right part of SwitchLabeledRule is always a
>> statement and reused ExpressionStatement to express Expression plus
>> semicolon, because syntactically it looks the same. Strictly following
>> a spec draft here looks even more ugly, because it requires more
>> object types in our code model and reduces the flexibility when we
>> need to perform code transformation. E.g. if we want to wrap
>> expression into block, currently we just need to replace an
>> ExpressionStatement with a Block not touching a SwitchLabel at all.
>> Had we mirrored the spec in our code model, we would need to replace
>> SwitchLabeledExpression with SwitchLabeledBlock which looks more
>> annoying.
>>
>> With best regards,
>> Tagir Valeev
>>
>> [1] http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.11
>>

From forax at univ-mlv.fr Thu Jan 17 09:45:24 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 17 Jan 2019 10:45:24 +0100 (CET)
Subject: We need more keywords, captain!
In-Reply-To: <86DCF84D-5C37-47C5-91BA-93385206FD49@oracle.com>
References: <86DCF84D-5C37-47C5-91BA-93385206FD49@oracle.com>
Message-ID: <5961639.901747.1547718324920.JavaMail.zimbra@u-pem.fr>

I think I prefer break-with. The problem with break-return is that people will write it as break return, without the hyphen; break return is in my opinion too close to return if you read the code too fast, and a break return without a value means nothing, unlike a regular return.

I like break-with because it's obvious that you have to say with what value you want to break, which is exactly the issue we have with the current break syntax.

So I vote for break-with instead of break; as Brian said, the expression switch is currently a preview feature of 12, so we can still tweak the syntax a bit.

Rémi

----- Original Message -----
> From: "Guy Steele"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Tuesday, January 8, 2019 18:23:36
> Subject: Re: We need more keywords, captain!

> Actually, even better than `break-with` would be `break-return`. It's clearly a
> kind of `break`, and also clearly a kind of `return`.
>
> I think maybe this application alone has won me over to the idea of hyphenated
> keywords.
>
> (Then again, for this specific application we don't even need the hyphen; we
> could just write `break return v;`.)
>
> --Guy
>
>> On Jan 8, 2019, at 12:35 PM, Brian Goetz wrote:
>>
>> When discussing this today at our compiler meeting, we realized a few more
>> places where the lack of keywords produce distortions we don't even notice. In
>> expression switch, we settled on `break value` as the way to provide a value
>> for a switch expression when the shorthand (`case L -> e`) doesn't suffice, but
>> this was painful for everyone. It's painful for users because there's now work
>> required to disambiguate whether `break foo` is a labeled break or a value
>> break; it was even more painful to specify, because a new form of abrupt
>> completion had to be threaded through the spec.
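
As an illustration of the ambiguity described in the quoted paragraph, here is a minimal sketch, assuming a Java 14 or later compiler, where the value-producing form of a switch expression ended up being spelled `yield` instead of `break value` or `break-with`. The class and method names (`BreakAmbiguityDemo`, `firstNegativeIndex`, `describe`) are hypothetical and only serve the example. Under the JDK 12 preview syntax discussed in this thread, the `yield` line would have been written as `break size + " positive";`, so a reader seeing `break foo;` had to work out whether `foo` was a label, as in the first method, or a value.

```java
public class BreakAmbiguityDemo {

    // Statement switch inside a labeled loop: `break outer` is a labeled break.
    static int firstNegativeIndex(int[] values) {
        int found = -1;
        outer:
        for (int i = 0; i < values.length; i++) {
            switch (Integer.signum(values[i])) {
                case -1:
                    found = i;
                    break outer; // labeled break: leaves the enclosing for loop
                default:
                    break;       // plain break: leaves only the switch
            }
        }
        return found;
    }

    // Expression switch: a block arm produces its value with `yield`.
    static String describe(int n) {
        return switch (Integer.signum(n)) {
            case -1 -> "negative";
            case 0 -> "zero";
            default -> {
                String size = n > 100 ? "large" : "small";
                yield size + " positive"; // JDK 12 preview form: break size + " positive";
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(firstNegativeIndex(new int[] {3, 0, -7, 5}));
        System.out.println(describe(250));
    }
}
```

Having a distinct word for the value-producing form means the two kinds of abrupt completion can no longer be confused at the use site, which is the simplification the hyphenated-keyword proposal was after.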
>> >> Being able to call this something like `break-with v` (or some other derived >> keyword) would have made this all a lot simpler. (BTW, we can still do this, >> since expression-switch is still in preview.) >> >> Moral of the story: even just a few minutes of brainstorming led us to several >> applications of this approach that we hadn't seen a few days ago. >> >> On 1/8/2019 10:22 AM, Brian Goetz wrote: >>> This document proposes a possible move that will buy us some breathing room in >>> the perpetual problem where the keyword-management tail wags the >>> programming-model dog. >>> >>> >>> ## We need more keywords, captain! >>> >>> Java has a fixed set of _keywords_ (JLS 3.9) which are not allowed to >>> be used as identifiers. This set has remained quite stable over the >>> years (for good reason), with the exceptions of `assert` added in 1.4, >>> `enum` added in 5, and `_` added in 9. In addition, there are also >>> several _reserved identifiers_ (`true`, `false`, and `null`) which >>> behave almost like keywords. >>> >>> Over time, as the language evolves, language designers face a >>> challenge; the set of keywords imagined in version 1.0 are rarely >>> suitable for expressing all the things we might ever want our language >>> to express. We have several tools at our disposal for addressing this >>> problem: >>> >>> - Eminent domain. Take words that were previously identifiers, and >>> turn them into keywords, as we did with `assert` in 1.4. >>> >>> - Recycle. Repurpose an existing keyword for something that it was >>> never really meant for (such as using `default` for annotation >>> values or default methods). >>> >>> - Do without. Find a way to pick a syntax that doesn't require a >>> new keyword, such as using `@interface` for annotations instead of >>> `annotation` -- or don't do the feature at all. >>> >>> - Smoke and mirrors. 
Create the illusion of context-dependent >>> keywords through various linguistic heroics (restricted keywords, >>> reserved type names.) >>> >>> In any given situation, all of these options are on the table -- but >>> most of the time, none of these options are very good. The lack of >>> reasonable options for extending the syntax of the language threatens >>> to become a significant impediment to language evolution. >>> >>> #### Why not "just" make new keywords? >>> >>> While it may be legal for us to declare `i` to be a keyword in a >>> future version of Java, this would likely break every program in the >>> world, since `i` is used so commonly as an identifier. (When the >>> `assert` keyword was added in 1.4, it broke every testing framework.) >>> The cost of remediating the effect of such incompatible changes varies >>> as well; invalidating a name choice for a local variable has a local >>> fix, but invalidating the name of a public type or an interface >>> method might well be fatal. >>> >>> Additionally, the keywords we're likely to want to reclaim are often >>> those that are popular as identifiers (e.g., `value`, `var`, >>> `method`), making such fatal collisions more likely. In some cases, >>> if the keyword candidate in question is sufficiently rarely used as an >>> identifier, we might still opt to take that source-compatibility hit >>> -- but names that are less likely to collide (e.g., >>> `usually_but_not_always_final`) are likely not the ones we want in our >>> language. Realistically, this is unlikely to be a well we can go to >>> very often, and the bar must be very high. >>> >>> #### Why not "just" live with the keywords we have? >>> >>> Reusing keywords in multiple contexts has ample precedent in >>> programming languages, including Java. (For example, we (ab)use `final` >>> for "not mutable", "not overridable", and "not extensible".) 
>>> Sometimes, using an existing keyword in a new context is natural and >>> sensible, but usually it's not our first choice. Over time, as the >>> range of demands we place on our keyword set expands, this may well >>> descend into the ridiculous; no one wants to use `null final` as a way >>> of negating finality. (While one might think such things are too >>> ridiculous to consider, note that we received serious-seeming >>> suggestions during JEP 325 to use `new switch` to describe a switch >>> with different semantics. Presumably to be followed by `new new >>> switch` in ten years.) >>> >>> Of course, one way to live without making new keywords is to stop >>> evolving the language entirely. While there are some who think this >>> is a fine idea, doing so because of the lack of available tokens would >>> be a silly reason. We are convinced that Java has a long life ahead of >>> it, and developers are excited about new features that enable them >>> to write more expressive and reliable code. >>> >>> #### Why not "just" make contextual keywords? >>> >>> At first glance, contextual keywords (and their friends, such as >>> reserved type identifiers) may appear to be a magic wand; they let us >>> create the illusion of adding new keywords without breaking existing >>> programs. But the positive track record of contextual keywords hides >>> a great deal of complexity and distortion. >>> >>> Each grammar position is its own story; contextual keywords that might >>> be used as modifiers (e.g., `readonly`) have different ambiguity >>> considerations than those that might be used in code (e.g., a `matches` >>> expression). The process of selecting a contextual keyword is not a >>> simple matter of adding it to the grammar; each one requires an >>> analysis of potential current and future interactions.
Similarly, >>> each token we try to repurpose may have its own special >>> considerations; for example, we could justify the use of `var` as a >>> reserved type name because the naming conventions are so >>> broadly adhered to. Finally, the use of contextual keywords in >>> certain syntactic positions can create additional considerations for >>> extending the syntax later. >>> >>> Contextual keywords create complexity for specifications, compilers, >>> and IDEs. With one or two special cases, we can often deal well >>> enough, but if special cases were to become more pervasive, this would >>> likely result in more significant maintenance costs or bug tail. While >>> it is easy to dismiss this as "not my problem", in reality, this is >>> everybody's problem. IDEs often have to guess whether a use of a >>> contextual keyword is a keyword or identifier, and they may not have >>> enough information to make a good guess until they've seen more input. >>> This results in worse user highlighting, auto-completion, and >>> refactoring abilities -- or worse. These problems quickly become >>> everyone's problems. >>> >>> So, while contextual keywords are one of the tools in our toolbox, >>> they should also be used sparingly. >>> >>> #### Why is this a problem? >>> >>> Aside from the obvious consequences of these problems (clunky syntax, >>> complexity, bugs), there is a more insidious hidden cost -- >>> distortion. The accidental details of keyword management pose a >>> constant risk of distortion in language design. >>> >>> One could consider the choice to use `@interface` instead of >>> `annotation` for annotations to be a distortion; having a descriptive >>> name rather than a funky combination of punctuation and keyword would >>> surely have made it easier for people to become familiar with >>> annotations. >>> >>> In another example, the set of modifiers (`public`, `private`, >>> `static`, `final`, etc) is not complete; there is no way to say "not >>> final"
or "not static". This, in turn, means that we cannot create >>> features where variables or classes are `final` by default, or members >>> are `static` by default, because there's no way to denote the desire >>> to opt out of it. While there may be reasons to justify a locally >>> suboptimal default anyway (such as global consistency), we want to >>> make these choices deliberately, not have them made for us by the >>> accidental details of keyword management. Choosing to leave out a >>> feature for reasons of simplicity is fine; leaving it out because we >>> don't have a way to denote the obvious semantics is not. >>> >>> It may not be obvious from the outside, but this is a constant problem >>> in evolving the language, and an ongoing tax that we all pay, directly >>> or indirectly. >>> >>> ## We need a new source of keyword candidates >>> >>> Every time we confront this problem, the overwhelming tendency is to >>> punt and pick one of the bad options, because the problem only comes >>> along every once in a while. But, with the features in the pipeline, I >>> expect it will continue to come along with some frequency, and I'd >>> rather get ahead of it. Given that all of these current options are >>> problematic, and there is not even a least-problematic move that >>> applies across all situations, my inclination is to try to expand the >>> set of lexical forms that can be used as keywords. >>> >>> As a not-serious example, take the convention that we've used for >>> experimental features, where we prefix provisional keywords in >>> prototypes with two underscores, as we did with `__ByValue` in the >>> Valhalla prototype. (We commonly do this in feature proposals and >>> prototypes, mostly to signify "this keyword is a placeholder for a >>> syntax decision to be made later", but also because it permits a >>> simple implementation that is unlikely to collide with existing code.)
>>> We could, for example, carve out the space of identifiers that begin >>> with underscore as being reserved for keywords. Of course, this isn't >>> so pretty, and it also means we'd have a mix of underscore and >>> non-underscore keywords, so it's not a serious suggestion, as much as >>> an example of the sort of move we are looking for. >>> >>> But I do have a serious suggestion: allow _hyphenated_ keywords where >>> one or more of the terms are already keywords or reserved identifiers. >>> Unlike restricted keywords, this creates much less trouble for >>> parsing, as (for example) `non-null` cannot be confused for a >>> subtraction expression, and the lexer can always tell with fixed >>> lookahead whether `a-b` is three tokens or one. This gives us a lot >>> more room for creating new, less-conflicting keywords. And these new >>> keywords are likely to be good names, too, as many of the missing >>> concepts we want to add describe their relationship to existing >>> language constructs -- such as `non-null`. >>> >>> Here's some examples where this approach might yield credible >>> candidates. (Note: none of these are being proposed here; this is >>> merely an illustrative list of examples of how this mechanism could >>> form keywords that might, in some particular possible future, be >>> useful and better than the alternatives we have now.)
>>> >>> - `non-null` >>> - `non-final` >>> - `package-private` (the default accessibility for class members, currently not >>> denotable) >>> - `public-read` (publicly readable, privately writable) >>> - `null-checked` >>> - `type-static` (a concept needed in Valhalla, which is static relative to a >>> particular specialization of a class, rather than the class itself) >>> - `default-value` >>> - `eventually-final` (what the `@Stable` annotation currently suggests) >>> - `semi-final` (an alternative to `sealed`) >>> - `exhaustive-switch` (opting into exhaustiveness checking for statement >>> switches) >>> - `enum-class`, `annotation-class`, `record-class` (we might have chosen these >>> as an alternative to `enum` and `@interface`, had we had the option) >>> - `this-class` (to describe the class literal for the current class) >>> - `this-return` (a common request is a way to mark a setter or builder method >>> as returning its receiver) >>> >>> (Again, the point is not to debate the merits of any of these specific >>> examples; the point is merely to illustrate what we might be able to do >>> with such a mechanism.) >>> >>> Having this as an option doesn't mean we can't also use the other >>> approaches when they are suitable; it just means we have more, and >>> likely less fraught, options with which to make better decisions. >>> >>> There are likely to be other lexical schemes by which new keywords can >>> be created without impinging on existing code; this one seems credible >>> and reasonably parsable by both machines and humans. >>> >>> #### "But that's ugly" >>> >>> Invariably, some percentage of readers will have an immediate and >>> visceral reaction to this idea. Let's stipulate for the record that >>> some people will find this ugly. (At least, at first. Many such >>> reactions are possibly-transient (see what I did there?) responses >>> to unfamiliarity.) 
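The proposal quoted above claims that a lexer can decide with fixed lookahead whether `a-b` is three tokens or one. A minimal sketch of that idea follows, written for illustration only: the class `HyphenLexer`, its keyword set, and the "no whitespace around the hyphen" rule are all assumptions, not anything specified in the thread or implemented in javac.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy lexer (hypothetical): treats a fixed set of hyphenated words as single tokens. */
final class HyphenLexer {

    // Assumed keyword set, borrowed from the illustrative list in the proposal.
    // Each entry contains an existing keyword or reserved identifier (null, final, break).
    private static final Set<String> HYPHENATED =
            new HashSet<>(Arrays.asList("non-null", "non-final", "break-with"));

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isJavaIdentifierStart(c)) {
                int end = wordEnd(src, i);
                // Fixed lookahead: exactly one '-' followed directly by one more word.
                if (end < src.length() - 1
                        && src.charAt(end) == '-'
                        && Character.isJavaIdentifierStart(src.charAt(end + 1))) {
                    int end2 = wordEnd(src, end + 1);
                    if (HYPHENATED.contains(src.substring(i, end2))) {
                        end = end2; // known hyphenated keyword: one token
                    }
                }
                tokens.add(src.substring(i, end));
                i = end;
            } else {
                tokens.add(String.valueOf(c)); // every other character is its own token
                i++;
            }
        }
        return tokens;
    }

    // Index just past the identifier-like word that starts at 'start'.
    private static int wordEnd(String src, int start) {
        int j = start;
        while (j < src.length() && Character.isJavaIdentifierPart(src.charAt(j))) {
            j++;
        }
        return j;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a-b"));               // [a, -, b]
        System.out.println(tokenize("non-null String s")); // [non-null, String, s]
        System.out.println(tokenize("break-with v;"));     // [break-with, v, ;]
    }
}
```

In this sketch, ordinary subtraction such as `a-b` is untouched because `a-b` is not in the keyword set, and `non - null` written with spaces still lexes as three tokens.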
>>> >>> From amaembo at gmail.com Thu Jan 17 10:12:08 2019 From: amaembo at gmail.com (Tagir Valeev) Date: Thu, 17 Jan 2019 17:12:08 +0700 Subject: We need more keywords, captain! In-Reply-To: <5961639.901747.1547718324920.JavaMail.zimbra@u-pem.fr> References: <86DCF84D-5C37-47C5-91BA-93385206FD49@oracle.com> <5961639.901747.1547718324920.JavaMail.zimbra@u-pem.fr> Message-ID: +1 to break-with With best regards, Tagir Valeev Thu, 17 Jan 2019, 16:46 Remi Forax forax at univ-mlv.fr: > I think I prefer break-with, > the problem of break-return is that people will write it break return > without the hyphen, break return is in my opinion too close to return if > you read the code too fast and a break return without a value means nothing > unlike a regular return. > > I like break-with because it's obvious that you have to say with what > value you want to break, which is exactly the issue we have with the > current break syntax. > > So I vote for break-with instead of break, > as Brian said, the expression switch is currently a preview feature of 12 > so we can still tweak the syntax a bit. > > Rémi > > ----- Original Message ----- > > From: "Guy Steele" > > To: "Brian Goetz" > > Cc: "amber-spec-experts" > > Sent: Tuesday, 8 January 2019 18:23:36 > > Subject: Re: We need more keywords, captain! > > > Actually, even better than `break-with` would be `break-return`. It's > clearly a > > kind of `break`, and also clearly a kind of `return`. > > > > I think maybe this application alone has won me over to the idea of > hyphenated > > keywords. > > > > (Then again, for this specific application we don't even need the > hyphen; we > > could just write `break return v;`.) > > > > --Guy > > > >> On Jan 8, 2019, at 12:35 PM, Brian Goetz > wrote: > >> > >> When discussing this today at our compiler meeting, we realized a few > more > >> places where the lack of keywords produces distortions we don't even > notice.
In > >> expression switch, we settled on `break value` as the way to provide a > value > >> for a switch expression when the shorthand (`case L -> e`) doesn't > suffice, but > >> this was painful for everyone. It's painful for users because there's > now work > >> required to disambiguate whether `break foo` is a labeled break or a > value > >> break; it was even more painful to specify, because a new form of abrupt > >> completion had to be threaded through the spec. > >> > >> Being able to call this something like `break-with v` (or some other > derived > >> keyword) would have made this all a lot simpler. (BTW, we can still do > this, > >> since expression-switch is still in preview.) > >> > >> Moral of the story: even just a few minutes of brainstorming led us to > several > >> applications of this approach that we hadn't seen a few days ago. > >> > >> On 1/8/2019 10:22 AM, Brian Goetz wrote: > >>> This document proposes a possible move that will buy us some breathing > room in > >>> the perpetual problem where the keyword-management tail wags the > >>> programming-model dog. > >>> > >>> [...]
> >>> > >>> #### "But that's ugly" > >>> > >>> Invariably, some percentage of readers will have an immediate and > >>> visceral reaction to this idea. Let's stipulate for the record that > >>> some people will find this ugly. (At least, at first. Many such > >>> reactions are possibly-transient (see what I did there?) responses > >>> to unfamiliarity.) > >>> > >>> > From forax at univ-mlv.fr Thu Jan 17 10:32:50 2019 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 17 Jan 2019 11:32:50 +0100 (CET) Subject: Sealed types -- updated proposal In-Reply-To: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> Message-ID: <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr> I'm still not 100% sure that mixing exhaustiveness and closedness is a good idea, again because - you may want closedness of non-user-named types - you may want exhaustiveness not only on types (on values, for example) but it makes the feature simple, so let's go that way. Allowing public auxiliary subtypes of a primary sealed type is the sweet spot for me, better than trying to introduce either a nesting which is not exactly nesting or a rule that only works for pattern matching. I don't understand how "semi-final" can be a good keyword; the name is too vague. Given that the proposal introduces the notion of sealed types, "sealed" is a better keyword. For un-sealing a subtype, "unsealed" seems to be a good keyword. Rémi > From: "Brian Goetz" > To: "amber-spec-experts" > Sent: Wednesday, 9 January 2019 19:44:12 > Subject: Sealed types -- updated proposal > Here's an update on the sealed type proposal based on recent discussions. > Definition. A sealed type is one for which subclassing is restricted according > to guidance specified with the type's declaration; finality can be considered a > degenerate form of sealing, where no subclasses at all are permitted.
Sealed > types are a sensible means of modeling algebraic sum types in a nominal type > hierarchy; they go nicely with records ( algebraic product types ), though are > also useful on their own. > Sealing serves two distinct purposes. The first, and more obvious, is that it > restricts who can be a subtype. This is largely a declaration-site concern, > where an API owner wants to defend the integrity of their API. The other is > that it potentially enables exhaustiveness analysis at the use site when > switching over sealed types (and possibly other features.) This is less > obvious, and the benefit is contingent on some other things, but is valuable as > it enables better compile-time type checking. > Declaration. We specify that a class is sealed by applying the semi-final > modifier to a class, abstract class, or interface: > semi-final interface Node { ... } > In this streamlined form, Node may be extended only by named classes declared in > the same nest. This may be suitable for many situations, but not for all; in > this case, the user may specify an explicit permits list: > semi-final interface Node > permits FooNode, BarNode { ... } > Note: permits here is a contextual keyword. > The two forms may not be combined; if there is a permits list, it must list all > the permitted subtypes. We can think of the simple form as merely inferring the > permits clause from information in the nest. > Exhaustiveness. One of the benefits of sealing is that the compiler can > enumerate the permitted subtypes of a sealed type; this in turn lets us perform > exhaustiveness analysis when switching over patterns involving sealed types. > Permitted subtypes must belong to the same module (or, if not in a module, the > same package.) > Note: It is superficially tempting to have a relaxed but less explicit form, say > which allows for a type to be extended by package-mates or module-mates without > listing them all. 
However, this would undermine the compiler's ability to > reason about exhaustiveness. This would achieve the desired subclassing > restrictions, but not the desired ability to reason about exhaustiveness. > Classfile. In the classfile, a sealed type is identified with an ACC_FINAL > modifier, and a PermittedSubtypes attribute which contains a list of permitted > subtypes (similar in structure to the nestmate attributes.) > Transitivity. Sealing is transitive; unless otherwise specified, an abstract > subtype of a sealed type is implicitly sealed (permits list to be inferred), > and a concrete subtype of a sealed type is implicitly final. This can be > reversed by explicitly modifying the subtype with the non-final modifier. > Unsealing a subtype in a hierarchy doesn't undermine the sealing, because the > (possibly inferred) set of explicitly permitted subtypes still constitutes a > total covering. However, users who know about unsealed subtypes can use this > information to their benefit (much like we do with exceptions today; you can > catch FileNotFoundException separately from IOException if you want, but don't > have to.) > Note: Scala made the opposite choice with respect to transitivity, requiring > sealing to be opted into at all levels. This is widely believed to be a source > of bugs; it is rare that one actually wants a subtype of a sealed type to not > be sealed. I suspect the reasoning in Scala was, at least partially, the desire > to not make up a new keyword for "not sealed". This is understandable, but I'd > rather not add to the list of "things for which Java got the defaults wrong."
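The permits list, the total covering, and explicit unsealing described above can be sketched in runnable form. The sketch below does not use the `semi-final` / `non-final` spelling from this proposal; it assumes the syntax that later shipped in Java 17 (`sealed`, `permits`, `non-sealed`), where each permitted subtype must state `final`, `sealed`, or `non-sealed` explicitly rather than inheriting sealing implicitly. The names `SealedSketch`, `Node`, `Leaf`, `Branch`, and `Extension` are hypothetical.

```java
/** Sketch only: uses Java 17 sealed-type syntax, not the proposal's `semi-final`. */
final class SealedSketch {

    // The permits list names every direct subtype, giving a closed set.
    sealed interface Node permits Leaf, Branch, Extension {}

    // Records are implicitly final, so these two branches are fully closed.
    record Leaf(int value) implements Node {}
    record Branch(Node left, Node right) implements Node {}

    // Explicit unsealing: Extension may be subclassed freely, yet the three
    // permitted subtypes of Node still cover every possible Node.
    static non-sealed class Extension implements Node {}

    static int sum(Node n) {
        if (n instanceof Leaf leaf) {
            return leaf.value();
        }
        if (n instanceof Branch b) {
            return sum(b.left()) + sum(b.right());
        }
        return 0; // Extension and anything that extends it
    }

    public static void main(String[] args) {
        Node tree = new Branch(new Leaf(1), new Branch(new Leaf(2), new Extension()));
        System.out.println(sum(tree));                                  // 3
        System.out.println(Node.class.isSealed());                      // true
        System.out.println(Node.class.getPermittedSubclasses().length); // 3
    }
}
```

The reflective calls at the end show that the permitted-subtype list is recorded with the class, which is what lets both the compiler and the runtime enforce it.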
> An example of where explicit unsealing (and private subtypes) is useful can be > found in the JEP-334 API: > semi-final interface ConstantDesc > permits String, Integer, Float, Long, Double, > ClassDesc, MethodTypeDesc, MethodHandleDesc, > DynamicConstantDesc { } > semi-final interface ClassDesc extends ConstantDesc > permits PrimitiveClassDescImpl, ReferenceClassDescImpl { } > private class PrimitiveClassDescImpl implements ClassDesc { } > private class ReferenceClassDescImpl implements ClassDesc { } > semi-final interface MethodTypeDesc extends ConstantDesc > permits MethodTypeDescImpl { } > semi-final interface MethodHandleDesc extends ConstantDesc > permits DirectMethodHandleDesc, MethodHandleDescImpl { } > semi-final interface DirectMethodHandleDesc extends MethodHandleDesc > permits DirectMethodHandleDescImpl { } > // designed for subclassing > non-final class DynamicConstantDesc implements ConstantDesc { ... } > Enforcement. Both the compiler and JVM should enforce sealing. > Accessibility. Subtypes need not be as accessible as the sealed parent. In this > case, not all clients will get the chance to exhaustively switch over them; > they'll have to make these switches exhaustive with a default clause or other > total pattern. When compiling a switch over such a sealed type, the compiler > can provide a useful error message ("I know this is a sealed type, but I can't > provide full exhaustiveness checking here because you can't see all the > subtypes, so you still need a default.") > Javadoc. The list of permitted subtypes should probably be considered part of > the spec, and incorporated into the Javadoc. Note that this is not exactly the > same as the current "All implementing classes" list that Javadoc currently > includes, so a list like "All permitted subtypes" might be added (possibly with > some indication if the subtype is less accessible than the parent.) > Auxiliary subtypes.
With the advent of records, which allow us to define > classes in a single line, the "one class per file" rule starts to seem a > little silly, and it constrains the user's ability to put related definitions > together (which may be more readable) while exporting a flat namespace in the > public API. > One way to get there would be to relax the "no public auxiliary classes" > rule for sealed classes, say by allowing public auxiliary subtypes of > the primary type, if the primary type is public and sealed. > Another would be to borrow a trick from enums; for a sealed type with nested > subtypes, when you import the sealed type, you implicitly import the nested > subtypes too. That way you could declare: > semi-final interface Node { > class A implements Node { } > class B implements Node { } > } > ...but clients could import Node and then refer to A and B directly: > switch (node) { > case A(): ... > case B(): ... > } > We do something similar for enum constants today. > From forax at univ-mlv.fr Thu Jan 17 10:56:55 2019 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 17 Jan 2019 11:56:55 +0100 (CET) Subject: We need more keywords, captain! In-Reply-To: References: Message-ID: <1456434200.926954.1547722615953.JavaMail.zimbra@u-pem.fr> My favorite hyphen keyword is short-circuit, I don't know where to use it, but it's so good that we have to find a new feature to introduce it :) As I said, I really like this proposal. The hyphen keywords nicely solve the issue when you want to introduce a keyword in the middle of the code, at a place where an identifier may occur. For a keyword at a declaration site (class, field, method), we can use either a contextual keyword or a hyphen keyword. (other comments inlined) ----- Original Message ----- > From: "Brian Goetz" > To: "amber-spec-experts" > Sent: Tuesday, 8 January 2019 16:22:17 > Subject: We need more keywords, captain!
> This document proposes a possible move that will buy us some breathing
> room in the perpetual problem where the keyword-management tail wags the
> programming-model dog.
>
>
> ## We need more keywords, captain!
>
> Java has a fixed set of _keywords_ (JLS 3.9) which are not allowed to
> be used as identifiers. This set has remained quite stable over the
> years (for good reason), with the exceptions of `assert` added in 1.4,
> `enum` added in 5, and `_` added in 9. In addition, there are also
> several _reserved identifiers_ (`true`, `false`, and `null`) which
> behave almost like keywords.
>
> Over time, as the language evolves, language designers face a
> challenge; the set of keywords imagined in version 1.0 is rarely
> suitable for expressing all the things we might ever want our language
> to express. We have several tools at our disposal for addressing this
> problem:
>
>  - Eminent domain. Take words that were previously identifiers, and
>    turn them into keywords, as we did with `assert` in 1.4.
>
>  - Recycle. Repurpose an existing keyword for something that it was
>    never really meant for (such as using `default` for annotation
>    values or default methods).
>
>  - Do without. Find a way to pick a syntax that doesn't require a
>    new keyword, such as using `@interface` for annotations instead of
>    `annotation` -- or don't do the feature at all.
>
>  - Smoke and mirrors. Create the illusion of context-dependent
>    keywords through various linguistic heroics (restricted keywords,
>    reserved type names.)
>
> In any given situation, all of these options are on the table -- but
> most of the time, none of these options are very good. The lack of
> reasonable options for extending the syntax of the language threatens
> to become a significant impediment to language evolution.
>
> #### Why not "just" make new keywords?
>
> While it may be legal for us to declare `i` to be a keyword in a
> future version of Java, this would likely break every program in the
> world, since `i` is used so commonly as an identifier. (When the
> `assert` keyword was added in 1.4, it broke every testing framework.)
> The cost of remediating the effect of such incompatible changes varies
> as well; invalidating a name choice for a local variable has a local
> fix, but invalidating the name of a public type or an interface
> method might well be fatal.
>
> Additionally, the keywords we're likely to want to reclaim are often
> those that are popular as identifiers (e.g., `value`, `var`,
> `method`), making such fatal collisions more likely. In some cases,
> if the keyword candidate in question is sufficiently rarely used as an
> identifier, we might still opt to take that source-compatibility hit
> -- but names that are less likely to collide (e.g.,
> `usually_but_not_always_final`) are likely not the ones we want in our
> language. Realistically, this is unlikely to be a well we can go to
> very often, and the bar must be very high.
>
> #### Why not "just" live with the keywords we have?
>
> Reusing keywords in multiple contexts has ample precedent in
> programming languages, including Java. (For example, we (ab)use `final`
> for "not mutable", "not overridable", and "not extensible".)
> Sometimes, using an existing keyword in a new context is natural and
> sensible, but usually it's not our first choice. Over time, as the
> range of demands we place on our keyword set expands, this may well
> descend into the ridiculous; no one wants to use `null final` as a way
> of negating finality. (While one might think such things are too
> ridiculous to consider, note that we received serious-seeming
> suggestions during JEP 325 to use `new switch` to describe a switch
> with different semantics. Presumably to be followed by `new new
> switch` in ten years.)
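
To make the three senses of `final` mentioned above concrete, here is a minimal Java sketch. The class names (`FinalMeanings`, `Point`, `Shape`, `Circle`) are made up for illustration and do not come from the proposal.

```java
class FinalMeanings {
    // Sense 1, "not extensible": a final class cannot be subclassed.
    static final class Point {
        // Sense 2, "not mutable": a final field is assigned exactly once.
        private final int x;

        Point(int x) { this.x = x; }

        int x() { return x; }
    }

    static class Shape {
        // Sense 3, "not overridable": subclasses cannot replace this method.
        final String kind() { return "shape"; }
    }

    static class Circle extends Shape {
        // String kind() { return "circle"; }  // would not compile: kind() is final
    }
}
```

One keyword carries three different jobs here, which is the kind of reuse the paragraph describes.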
>
> Of course, one way to live without making new keywords is to stop
> evolving the language entirely. While there are some who think this
> is a fine idea, doing so because of the lack of available tokens would
> be a silly reason. We are convinced that Java has a long life ahead of
> it, and developers are excited about new features that enable them
> to write more expressive and reliable code.
>
> #### Why not "just" make contextual keywords?
>
> At first glance, contextual keywords (and their friends, such as
> reserved type identifiers) may appear to be a magic wand; they let us
> create the illusion of adding new keywords without breaking existing
> programs. But the positive track record of contextual keywords hides
> a great deal of complexity and distortion.
>
> Each grammar position is its own story; contextual keywords that might
> be used as modifiers (e.g., `readonly`) have different ambiguity
> considerations than those that might be used in code (e.g., a `matches`
> expression). The process of selecting a contextual keyword is not a
> simple matter of adding it to the grammar; each one requires an
> analysis of potential current and future interactions. Similarly,
> each token we try to repurpose may have its own special
> considerations; for example, we could justify the use of `var` as a
> reserved type name because the naming conventions are so
> broadly adhered to. Finally, the use of contextual keywords in
> certain syntactic positions can create additional considerations for
> extending the syntax later.
>
> Contextual keywords create complexity for specifications, compilers,
> and IDEs. With one or two special cases, we can often deal well
> enough, but if special cases were to become more pervasive, this would
> likely result in more significant maintenance costs or bug tail. While
> it is easy to dismiss this as "not my problem", in reality, this is
> everybody's problem.
> IDEs often have to guess whether a use of a
> contextual keyword is a keyword or identifier, and it may not have
> enough information to make a good guess until it's seen more input.
> This results in worse user highlighting, auto-completion, and
> refactoring abilities -- or worse. These problems quickly become
> everyone's problems.

I fully agree about the specification cost, but contextual keywords mostly have a single implementation cost that you pay once for all local keywords. Once a lexer/parser has paid that cost (which is not negligible), the cost of each new keyword is small, and very close to zero if you have a parser generator. Because a contextual keyword is recognized by the parser, it doesn't worsen any IDE/compiler feature built on top of the parser, so auto-completion, refactoring, etc. are not impacted. Syntax highlighting can be impacted depending on how it's implemented (on top of the lexer vs. on top of the parser).

And I don't get what the "additional considerations for extending the syntax later" are.

>
> So, while contextual keywords are one of the tools in our toolbox,
> they should also be used sparingly.

Yes!

>
> #### Why is this a problem?
>
> Aside from the obvious consequences of these problems (clunky syntax,
> complexity, bugs), there is a more insidious hidden cost --
> distortion. The accidental details of keyword management pose a
> constant risk of distortion in language design.
>
> One could consider the choice to use `@interface` instead of
> `annotation` for annotations to be a distortion; having a descriptive
> name rather than a funky combination of punctuation and keyword would
> surely have made it easier for people to become familiar with
> annotations.
>
> In another example, the set of modifiers (`public`, `private`,
> `static`, `final`, etc) is not complete; there is no way to say "not
> final" or "not static".
> This, in turn, means that we cannot create
> features where variables or classes are `final` by default, or members
> are `static` by default, because there's no way to denote the desire
> to opt out of it. While there may be reasons to justify a locally
> suboptimal default anyway (such as global consistency), we want to
> make these choices deliberately, not have them made for us by the
> accidental details of keyword management. Choosing to leave out a
> feature for reasons of simplicity is fine; leaving it out because we
> don't have a way to denote the obvious semantics is not.
>
> It may not be obvious from the outside, but this is a constant problem
> in evolving the language, and an ongoing tax that we all pay, directly
> or indirectly.
>
> ## We need a new source of keyword candidates
>
> Every time we confront this problem, the overwhelming tendency is to
> punt and pick one of the bad options, because the problem only comes
> along every once in a while. But, with the features in the pipeline, I
> expect it will continue to come along with some frequency, and I'd
> rather get ahead of it. Given that all of these current options are
> problematic, and there is not even a least-problematic move that
> applies across all situations, my inclination is to try to expand the
> set of lexical forms that can be used as keywords.
>
> As a not-serious example, take the convention that we've used for
> experimental features, where we prefix provisional keywords in
> prototypes with two underscores, as we did with `__ByValue` in the
> Valhalla prototype. (We commonly do this in feature proposals and
> prototypes, mostly to signify "this keyword is a placeholder for a
> syntax decision to be made later", but also because it permits a
> simple implementation that is unlikely to collide with existing code.)
> We could, for example, carve out the space of identifiers that begin
> with underscore as being reserved for keywords.
> Of course, this isn't
> so pretty, and it also means we'd have a mix of underscore and
> non-underscore keywords, so it's not a serious suggestion, as much as
> an example of the sort of move we are looking for.
>
> But I do have a serious suggestion: allow _hyphenated_ keywords where
> one or more of the terms are already keywords or reserved identifiers.
> Unlike restricted keywords, this creates much less trouble for
> parsing, as (for example) `non-null` cannot be confused for a
> subtraction expression, and the lexer can always tell with fixed
> lookahead whether `a-b` is three tokens or one. This gives us a lot
> more room for creating new, less-conflicting keywords. And these new
> keywords are likely to be good names, too, as many of the missing
> concepts we want to add describe their relationship to existing
> language constructs -- such as `non-null`.

Technically, it's not lookahead, which is a parser thing; it's the lexer being greedy.

>
> Here are some examples where this approach might yield credible
> candidates. (Note: none of these are being proposed here; this is
> merely an illustrative list of examples of how this mechanism could
> form keywords that might, in some particular possible future, be
> useful and better than the alternatives we have now.)
>
>   - `non-null`
>   - `non-final`
>   - `package-private` (the default accessibility for class members,
>     currently not denotable)
>   - `public-read` (publicly readable, privately writable)
>   - `null-checked`
>   - `type-static` (a concept needed in Valhalla, which is static
>     relative to a particular specialization of a class, rather than the
>     class itself)
>   - `default-value`
>   - `eventually-final` (what the `@Stable` annotation currently suggests)
>   - `semi-final` (an alternative to `sealed`)
>   - `exhaustive-switch` (opting into exhaustiveness checking for statement
>     switches)
>   - `enum-class`, `annotation-class`, `record-class` (we might have chosen these
>     as an alternative to `enum` and `@interface`, had we had the option)
>   - `this-class` (to describe the class literal for the current class)
>   - `this-return` (a common request is a way to mark a setter or builder method
>     as returning its receiver)
>
> (Again, the point is not to debate the merits of any of these specific
> examples; the point is merely to illustrate what we might be able to do
> with such a mechanism.)
>
> Having this as an option doesn't mean we can't also use the other
> approaches when they are suitable; it just means we have more, and
> likely less fraught, options with which to make better decisions.
>
> There are likely to be other lexical schemes by which new keywords can
> be created without impinging on existing code; this one seems credible
> and reasonably parsable by both machines and humans.
>
> #### "But that's ugly"
>
> Invariably, some percentage of readers will have an immediate and
> visceral reaction to this idea. Let's stipulate for the record that
> some people will find this ugly. (At least, at first. Many such
> reactions are possibly-transient (see what I did there?) responses
> to unfamiliarity.)

Rémi

From guy.steele at oracle.com Thu Jan 17 11:23:28 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 17 Jan 2019 11:23:28 +0000
Subject: We need more keywords, captain!
In-Reply-To: <5961639.901747.1547718324920.JavaMail.zimbra@u-pem.fr>
References: <86DCF84D-5C37-47C5-91BA-93385206FD49@oracle.com> <5961639.901747.1547718324920.JavaMail.zimbra@u-pem.fr>

I am persuaded by your argument. I thought we should consider break-return, but I am now convinced that overall break-with is the better choice.
--Guy

Sent from my iPhone

> On Jan 17, 2019, at 9:45 AM, Remi Forax wrote:
>
> I think I prefer break-with;
> the problem with break-return is that people will write it as `break return` without the hyphen. `break return` is, in my opinion, too close to `return` if you read the code too fast, and a `break return` without a value means nothing, unlike a regular return.
>
> I like break-with because it's obvious that you have to say with what value you want to break, which is exactly the issue we have with the current break syntax.
>
> So I vote for break-with instead of break;
> as Brian said, the expression switch is currently a preview feature of 12, so we can still tweak the syntax a bit.
>
> Rémi
>
> ----- Original Message -----
>> From: "Guy Steele"
>> To: "Brian Goetz"
>> Cc: "amber-spec-experts"
>> Sent: Tuesday, January 8, 2019 18:23:36
>> Subject: Re: We need more keywords, captain!
>
>> Actually, even better than `break-with` would be `break-return`. It's clearly a
>> kind of `break`, and also clearly a kind of `return`.
>>
>> I think maybe this application alone has won me over to the idea of hyphenated
>> keywords.
>>
>> (Then again, for this specific application we don't even need the hyphen; we
>> could just write `break return v;`.)
>>
>> --Guy
>>
>>> On Jan 8, 2019, at 12:35 PM, Brian Goetz wrote:
>>>
>>> When discussing this today at our compiler meeting, we realized a few more
>>> places where the lack of keywords produces distortions we don't even notice. In
>>> expression switch, we settled on `break value` as the way to provide a value
>>> for a switch expression when the shorthand (`case L -> e`) doesn't suffice, but
>>> this was painful for everyone. It's painful for users because there's now work
>>> required to disambiguate whether `break foo` is a labeled break or a value
>>> break; it was even more painful to specify, because a new form of abrupt
>>> completion had to be threaded through the spec.
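
The two meanings of `break` described above can be put side by side in a small sketch. It assumes Java 14 or later, where the value-producing form that eventually shipped is spelled `yield` rather than the `break-with` discussed in this thread; the class and method names are hypothetical.

```java
class SwitchValueDemo {
    // Value-producing arm: `yield` hands a value back to the enclosing
    // switch expression, the role `break value` had in the Java 12 preview.
    static int score(String word) {
        return switch (word) {
            case "one" -> 1;
            case "two" -> 2;
            default -> {
                int n = word.length();
                yield n * 10;
            }
        };
    }

    // Labeled break: `break outer` names a statement to exit, not a value.
    static int firstNegative(int[][] rows) {
        int found = 0;
        outer:
        for (int[] row : rows) {
            for (int v : row) {
                if (v < 0) {
                    found = v;
                    break outer;
                }
            }
        }
        return found;
    }
}
```

With a distinct word for the value form, a reader no longer has to work out whether the identifier after `break` is a label or an expression.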
>>>
>>> Being able to call this something like `break-with v` (or some other derived
>>> keyword) would have made this all a lot simpler. (BTW, we can still do this,
>>> since expression-switch is still in preview.)
>>>
>>> Moral of the story: even just a few minutes of brainstorming led us to several
>>> applications of this approach that we hadn't seen a few days ago.
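
The hyphenated-keyword proposal in this thread rests on the claim that a lexer can decide locally whether `a-b` is one token or three, which Rémi describes as the lexer being greedy. A minimal sketch of that idea follows. It assumes a fixed, hypothetical keyword set and a toy token model (plain strings); it is not how javac's scanner is written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HyphenLexer {
    // Hypothetical keyword set, for illustration only.
    static final Set<String> HYPHENATED =
            new HashSet<>(Arrays.asList("non-null", "non-final", "break-with"));

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isJavaIdentifierStart(c)) {
                int end = wordEnd(src, i);
                // Greedy step: try to extend the match across "-word".
                if (end + 1 < src.length()
                        && src.charAt(end) == '-'
                        && Character.isJavaIdentifierStart(src.charAt(end + 1))) {
                    int end2 = wordEnd(src, end + 1);
                    if (HYPHENATED.contains(src.substring(i, end2))) {
                        tokens.add(src.substring(i, end2));
                        i = end2;
                        continue;
                    }
                }
                tokens.add(src.substring(i, end));
                i = end;
            } else {
                // Every other character is a single-character token.
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    // Index just past the identifier-like word starting at `start`.
    static int wordEnd(String s, int start) {
        int j = start;
        while (j < s.length() && Character.isJavaIdentifierPart(s.charAt(j))) {
            j++;
        }
        return j;
    }
}
```

In this sketch `tokenize("a-b")` produces three tokens while `tokenize("non-null x")` produces two, because only spellings in the keyword set are merged. Existing subtraction expressions are unaffected unless their exact spelling is reserved.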

From brian.goetz at oracle.com Thu Jan 17 16:26:06 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Jan 2019 11:26:06 -0500
Subject: break-with
Message-ID: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com>

> Being able to call this something like `break-with v` (or some other derived keyword) would have made this all a lot simpler. (BTW, we can still do this, since expression-switch is still in preview.)

It seems we're all in favor of break-with over unadorned "break"?

Which feeds into the bigger question about promoting expression switch to final in 13. I don't think this syntactic change on its own merits re-previewing the feature; this is exactly the sort of "feature is finished, but we might change the paint color based on feedback" kind of thing that the preview mechanism was intended for.

We don't have to make this decision quite yet, but sometime between now and feature-freeze for 13 (June) we have to take one of the following actions:

 - File a JEP to make it a permanent feature, possibly with changes
 - File a JEP to re-preview it, possibly with changes
 - Withdraw the feature

We can continue to gather feedback on the feature and revisit later.

From brian.goetz at oracle.com Thu Jan 17 16:50:36 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Jan 2019 11:50:36 -0500
Subject: Sealed types -- updated proposal
In-Reply-To: <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr>
Message-ID: <95AA2856-D923-44DF-B0E8-F813617FE9D7@oracle.com>

> Allowing public auxiliary subtype of a primary sealed type is the sweet spot for me, better than trying to introduce either a nesting which is not exactly nesting or a rule that only works for pattern matching.

It was not my intent to propose something that "only works for pattern matching"
(I presume you're thinking about the treatment of enums in switch, and carrying that over more or less directly.) I was suggesting something a little broader; if you have a sealed type X, and you import X, you would automatically get X.{A..Z} statically imported where A..Z are subtypes. This gives you the enum behavior, but more broadly; you can say "new A", etc. (We can consider extending this to enums as well, since enums and sealed types have such close affinity.) This is still less intrusive than public aux types.

But, even adopting the "enum" behavior might well be good enough, has precedent, and is surely simpler; the place where nesting would bite the most is in switches, and this would provide relief.

Further, I suspect that the "public aux subtypes of primary sealed type" will be received by the audience more as "glass half empty"; rather than being happy about the new situations where they could use aux types, they'll be annoyed at where they can't, or frustrated with the complexity of the rule. Finally, the arguments against using aux types (findability) have some merit. So I was looking for something less sharp-edged.

> I don't understand how "semi-final" can be a good keyword, the name is too vague. Given that the proposal introduces the notion of sealed types, "sealed" is a better keyword.

There are two sides here. The connection to finality is powerful, and I like that. On the other hand, semi-final might sound nonsensical (like "half pregnant") to some, and silly (because of the pun) to others. So I'll accept that this is likely to strike some people as "too clever" and cause more than its share of unnecessary whining.

Contextual keywords are usually OK as modifiers (as long as they don't want to show up somewhere else), so `sealed` is not terrible.

In earlier discussions, there was some concern about sealed vs final (Kevin), especially with regard to negation.
I thought about this some more and I think we can say:

- A subtype of a sealed type is implicitly sealed.
- If that subtype is a concrete class without a permits clause, then it is effectively final, though not actually final. (You can say final explicitly if you want the belt-and-suspenders.)
- You can un-do the inheritance of sealing with "non-sealed", whether the subtype is a class or interface, abstract or concrete.

So no non-sealed vs non-final confusion.

> For un-sealing a subtype, "unsealed" seems to be a good keyword.

If the keyword is `sealed`, then I strongly prefer its opposite be `non-sealed`. (Among other reasons, I don't want to open the door to having different inversions for different modifiers.)

From forax at univ-mlv.fr Thu Jan 17 17:47:02 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 17 Jan 2019 18:47:02 +0100 (CET)
Subject: break-with
In-Reply-To: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com>
References: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com>
Message-ID: <494172826.1054481.1547747222643.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Thursday, 17 January 2019 17:26:06
> Subject: break-with

>> Being able to call this something like `break-with v` (or some other derived keyword) would have made this all a lot simpler. (BTW, we can still do this, since expression-switch is still in preview.)
>
> It seems we're all in favor of break-with over unadorned "break"?
>
> Which feeds into the bigger question about promoting expression switch to final in 13. I don't think this syntactic change on its own merits re-previewing the feature; this is exactly the sort of "feature is finished, but we might change the paint color based on feedback" kind of thing that the preview mechanism was intended for.
>
> We don't have to make this decision quite yet, but sometime between now and feature-freeze for 13 (June) we have to take one of the following actions:
>
> - File a JEP to make it a permanent feature, possibly with changes
> - File a JEP to re-preview it, possibly with changes
> - Withdraw the feature
>
> We can continue to gather feedback on the feature and revisit later.

Like last year, I (with José Paumar) will run a poll at Devoxx France (17th to 19th of April) on the expression switch; I can ask if it should be included permanently, kept in preview, or withdrawn, and post the result on this list.

Rémi

From brian.goetz at oracle.com Thu Jan 17 17:49:20 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Jan 2019 12:49:20 -0500
Subject: break-with
In-Reply-To: <494172826.1054481.1547747222643.JavaMail.zimbra@u-pem.fr>
References: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com> <494172826.1054481.1547747222643.JavaMail.zimbra@u-pem.fr>
Message-ID: <22755145-9A5D-467E-84C0-AB9F597DF847@oracle.com>

> Like last year, I (with José Paumar) will run a poll at Devoxx France (17th to 19th of April) on the expression switch; I can ask if it should be included permanently, kept in preview, or withdrawn, and post the result on this list.

Honestly, I think such a poll should be limited to people who have written at least 1000 lines of code with the new feature; otherwise we will get a very noisy answer, where people either make a decision with no information, or only with the information of the most vocal rant they've read on Twitter about it.

If you have any idea where we can find more than three people who meet this requirement in the same room, that would be very useful!
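
For readers who want the construct under discussion in front of them, here is a minimal sketch. It is not taken from the thread: the class and method names are made up, and it uses `yield`, the spelling that eventually shipped when switch expressions became final in Java 14, purely so that it compiles on a current JDK. In the JDK 12 preview the same statement was written `break penalty * 2;`, and `break-with penalty * 2;` is the alternative being weighed in this thread.

```java
public class BreakWithSketch {
    // Hypothetical example: maps a day name to a score.
    // The block arm needs some way to hand a value back to the enclosing
    // switch expression; that "some way" is what the thread is naming.
    public static int score(String day) {
        return switch (day) {
            case "SATURDAY", "SUNDAY" -> 0;   // expression arm: no keyword needed
            case "MONDAY" -> {
                int penalty = day.length();   // multi-statement work in a block
                yield penalty * 2;            // JDK 12 preview spelling: break penalty * 2;
            }
            default -> day.length();
        };
    }

    public static void main(String[] args) {
        System.out.println(score("MONDAY"));
        System.out.println(score("SUNDAY"));
        System.out.println(score("FRIDAY"));
    }
}
```

The point of contention is only the keyword inside the block arm; the arrow arms that produce a value directly are unaffected by the choice.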
From forax at univ-mlv.fr Thu Jan 17 18:00:43 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 17 Jan 2019 19:00:43 +0100 (CET)
Subject: Sealed types -- updated proposal
In-Reply-To: <95AA2856-D923-44DF-B0E8-F813617FE9D7@oracle.com>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr> <95AA2856-D923-44DF-B0E8-F813617FE9D7@oracle.com>
Message-ID: <1957606104.1056433.1547748043416.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 17 January 2019 17:50:36
> Subject: Re: Sealed types -- updated proposal

>> Allowing public auxiliary subtype of a primary sealed type is the sweet spot for me, better than trying to introduce either a nesting which is not exactly nesting or a rule that only works for pattern matching.
>
> It was not my intent to propose something that "only works for pattern matching" (I presume you're thinking about the treatment of enums in switch, and carrying that over more or less directly.) I was suggesting something a little broader; if you have a sealed type X, and you import X, you would automatically get X.{A..Z} statically imported where A..Z are subtypes. This gives you the enum behavior, but more broadly; you can say "new A", etc. (We can consider extending this to enums as well, since enums and sealed types have such close affinity.) This is still less intrusive than public aux types.
>
> But, even adopting the "enum" behavior might well be good enough, has precedent, and is surely simpler; the place where nesting would bite the most is in switches, and this would provide relief.
>
> Further, I suspect that the "public aux subtypes of primary sealed type" will be received by the audience more as "glass half empty"; rather than being happy about the new situations where they could use aux types, they'll be annoyed at where they can't, or frustrated with the complexity of the rule.
> Finally, the arguments against using aux types (findability) have some merit. So I was looking for something less sharp-edged.
>
>> I don't understand how "semi-final" can be a good keyword, the name is too vague. Given that the proposal introduces the notion of sealed types, "sealed" is a better keyword.
>
> There's two sides here. The connection to finality is powerful, and I like that. On the other hand, semi-final might sound nonsensical (like "half pregnant") to some, and silly (because of the pun) to others. So I'll accept that this is likely to strike some people as "too clever" and cause more than its share of unnecessary whining.
>
> Contextual keywords are usually OK as modifiers (as long as they don't want to show up somewhere else), so `sealed` is not terrible.

???, I'm confused: a contextual keyword means it's only a keyword in some context, so if it shows up somewhere else, it's not a keyword.

That said, I think we should have a strategy (like with '_') to gradually promote contextual keywords to real keywords (maybe apart from the ones in module-info). If a contextual keyword is introduced in release N, the compiler should also emit a warning for all identifiers with the same name. In release N + K, the compiler can then emit an error instead of a warning.

> In earlier discussions, there was some concern about sealed vs final (Kevin), especially with regard to negation. I thought about this some more and I think we can say:
>
> - A subtype of a sealed type is implicitly sealed.
> - If that subtype is a concrete class without a permits clause, then it is effectively final, though not actually final. (You can say final explicitly if you want the belt-and-suspenders.)
> - You can un-do the inheritance of sealing with "non-sealed", whether the subtype is a class or interface, abstract or concrete.
>
> So no non-sealed vs non-final confusion.
>
>> For un-sealing a subtype, "unsealed" seems to be a good keyword.
> If the keyword is `sealed`, then I strongly prefer its opposite be `non-sealed`. (Among other reasons, I don't want to open the door to having different inversions for different modifiers.)

non- is good for me.

Rémi

From brian.goetz at oracle.com Thu Jan 17 18:03:21 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Jan 2019 13:03:21 -0500
Subject: Sealed types -- updated proposal
In-Reply-To: <1957606104.1056433.1547748043416.JavaMail.zimbra@u-pem.fr>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr> <95AA2856-D923-44DF-B0E8-F813617FE9D7@oracle.com> <1957606104.1056433.1547748043416.JavaMail.zimbra@u-pem.fr>
Message-ID:

>> Contextual keywords are usually OK as modifiers (as long as they don't want to show up somewhere else), so `sealed` is not terrible.
>
> ???, I'm confused: a contextual keyword means it's only a keyword in some context, so if it shows up somewhere else, it's not a keyword.

Consider "extends". Yes, it shows up in the class declaration grammar, but it also wants to show up in type bounds. We have to think very carefully whether a contextual keyword we introduce today, when we might only be thinking of the declaration, will come back later in less constrained syntactic contexts.
From forax at univ-mlv.fr Thu Jan 17 18:07:38 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 17 Jan 2019 19:07:38 +0100 (CET)
Subject: break-with
In-Reply-To: <22755145-9A5D-467E-84C0-AB9F597DF847@oracle.com>
References: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com> <494172826.1054481.1547747222643.JavaMail.zimbra@u-pem.fr> <22755145-9A5D-467E-84C0-AB9F597DF847@oracle.com>
Message-ID: <340613954.1057062.1547748458442.JavaMail.zimbra@u-pem.fr>

The "choose your own adventure" solution: you have a kata on the expression switch, each question of the kata gives you a small number when you run it, and if you concatenate all the numbers, you have the poll id!

Rémi

----- Original message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 17 January 2019 18:49:20
> Subject: Re: break-with

>> Like last year, I (with José Paumar) will run a poll at Devoxx France (17th to 19th of April) on the expression switch; I can ask if it should be included permanently, kept in preview, or withdrawn, and post the result on this list.
>
> Honestly, I think such a poll should be limited to people who have written at least 1000 lines of code with the new feature; otherwise we will get a very noisy answer, where people either make a decision with no information, or only with the information of the most vocal rant they've read on Twitter about it.
>
> If you have any idea where we can find more than three people who meet this requirement in the same room, that would be very useful!
From forax at univ-mlv.fr Thu Jan 17 18:14:28 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 17 Jan 2019 19:14:28 +0100 (CET)
Subject: Sealed types -- updated proposal
In-Reply-To:
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr> <95AA2856-D923-44DF-B0E8-F813617FE9D7@oracle.com> <1957606104.1056433.1547748043416.JavaMail.zimbra@u-pem.fr>
Message-ID: <547724913.1057879.1547748868450.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 17 January 2019 19:03:21
> Subject: Re: Sealed types -- updated proposal

>>> Contextual keywords are usually OK as modifiers (as long as they don't want to show up somewhere else), so `sealed` is not terrible.
>>
>> ???, I'm confused: a contextual keyword means it's only a keyword in some context, so if it shows up somewhere else, it's not a keyword.
>
> Consider "extends". Yes, it shows up in the class declaration grammar, but it also wants to show up in type bounds. We have to think very carefully whether a contextual keyword we introduce today, when we might only be thinking of the declaration, will come back later in less constrained syntactic contexts.

OK, "extends" is not a good example here! (I force my students to read it as 'subtypes' if enclosed in <>.)

What you are saying is: what if we introduce a contextual keyword and later decide that the same construct should also be allowed in the middle of the code, where the contextual keyword is ambiguous? Then we are in trouble. Yes, that's an issue.

Thanks!

Rémi
From brian.goetz at oracle.com Thu Jan 17 18:20:05 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Jan 2019 13:20:05 -0500
Subject: Sealed types -- updated proposal
In-Reply-To: <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr>
Message-ID: <3E0B65B1-F965-4F9A-9266-3428CA1AE2D7@oracle.com>

> Given that the proposal introduces the notion of sealed types, "sealed" is a better keyword.

Note that `sealed` already has a meaning in the context of packages (see Package.isSealed()), though it is minor.

Is there a different hyphenation of final other than semi-final that maintains the connection to finality, but doesn't weird people out?

(One thing I dislike about sealed / non-sealed is that now we have _contextual_ keywords with hyphens, which wasn't the discipline we were aiming for in the hyphenation proposal.)

From alex.buckley at oracle.com Thu Jan 17 19:39:41 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Thu, 17 Jan 2019 11:39:41 -0800
Subject: Hyphenated keywords and switch expressions
In-Reply-To: <7A903926-3E7C-498A-9D57-E5AA0462D568@oracle.com>
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com> <5C393807.8000500@oracle.com> <5C3CF3AB.5030602@oracle.com> <7A903926-3E7C-498A-9D57-E5AA0462D568@oracle.com>
Message-ID: <5C40D9FD.3080405@oracle.com>

Thanks Gavin. The "Jan 2019" edition looks good. The relative shapes of switch statements and switch expressions can be easily discerned by reading [1] and [2] side by side. The renumbering, which fits with my plans for the JLS, is also welcome in advance of the public commentary that we can expect on this spec come JDK 12 GA.
Alex [1] http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.11.2 [2] http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-15.28.1 On 1/17/2019 1:14 AM, Gavin Bierman wrote: > Thank you Alex and Tagir. I have uploaded a new version of the spec > at: > > http://cr.openjdk.java.net/~gbierman/switch-expressions.html > > This contains all the changes you suggested below. In addition, there > is a small bug fix in 5.6.3 concerning widening > (https://bugs.openjdk.java.net/browse/JDK-8213180). I have also taken > the opportunity to reorder chapter 15 slightly, so switch expressions > are now section 15.28 and constant expressions are now section 15.29 > (the last section in the chapter). > > Comments welcome! Gavin > > >> On 14 Jan 2019, at 21:40, Alex Buckley >> wrote: >> >> Hi Gavin, >> >> Some points driven partly by the discussion with Tagir: >> >> 1. In 14.11.1, SwitchLabeledBlock should not end with a `;` -- >> there is no indication in JEP 325 that a semicolon is desired after >> `-> {...}` and javac in JDK 12 does not accept one there. Also, >> SwitchLabeledThrowStatement should not end with a `;` because >> ThrowStatement includes a `;`. >> >> 2. In 14.11.1, "This block can either be empty, or take one of two >> forms:" is wrong for switch expressions. The emptiness allowed by >> the grammar will be banned semantically in 15.29.1, so 14.11.1 >> should avoid trouble by speaking broadly of the forms in an >> educational tone: "A switch block can consist of either: - _Switch >> labeled rules_, which use `->` to introduce either a _switch >> labeled expression_, ..." Also, "optionally followed by switch >> labels." is wrong for switch expressions, so prefer: "- _Switch >> labeled statement groups_, which use `:` to introduce block >> statements." >> >> 3. In 15.29.1: (this is mainly driven by eyeballing against >> 14.11.2) >> >> - Incorrect Markdown in section header. 
>> >> - The error clause in the following bullet is redundant because the >> list header already called for an error: "The switch block must be >> compatible with the type of the selector expression, *****or a >> compile-time error occurs*****." >> >> - I would prefer to pull the choice of {default label, enum typed >> selector expression} into a fourth bullet of the prior list, to >> align how 14.11.2's list has a bullet concerning default label. >> >> - The significant rule from 14.11.2 that "If the switch block >> consists of switch labeled rules, then any switch labeled >> expression must be a statement expression (14.8)." has no parallel >> in 15.29.1. Instead, for switch labeled rules, 15.29.1 has a rule >> for switch labeled blocks. (1) We haven't seen switch labeled >> blocks for ages, so a cross-ref to 14.11.1 is due. (2) A note that >> switch exprs allow `-> ANY_EXPRESSION` while switch statements >> allow `-> NOT_ANY_EXPRESSION` is due in both sections; grep ch.8 >> for "In this respect" to see what I mean. (3) The semantic >> constraints on switch labeled rules+statement groups in 15.29.1 >> should be easily contrastable with those in 14.11.2 -- one approach >> is to pull the following constraints into 15.29.1's "all conditions >> true, or error" list: >> >> ----- - If the switch block consists of switch labeled rules, then >> any switch labeled block (14.11.1) MUST COMPLETE ABRUPTLY. - If the >> switch block consists of switch labeled statement groups, then the >> last statement in the switch block MUST COMPLETE ABRUPTLY, and the >> switch block MUST NOT HAVE ANY SWITCH LABELS AFTER THE LAST SWITCH >> LABELED STATEMENT GROUP. ----- >> >> If you prefer to keep these semantic constraints standalone so that >> they have negative polarity, then 14.11.2 should to the same for >> its significant-but-easily-missed "must be a statement expression" >> constraint. >> >> Alex >> >> On 1/13/2019 2:53 AM, Tagir Valeev wrote: >>> Hello! 
>>> >>>> I'm concerned about any claim of ambiguity in the grammar, >>>> though I'm not sure I'm following you correctly. I agree that >>>> your first fragment is parsed as two statements -- a switch >>>> statement and an empty statement -- but I don't know what you >>>> mean about "inside switch expression rule" for your second >>>> fragment. A switch expression is not an expression statement >>>> (JLS 14.8). In your second fragment, the leftmost default label >>>> is followed not by a block or a throw statement but by an >>>> expression (`switch (0) {...}`, a unary expression) and a >>>> semicolon. >>> >>> Ah, ok, we moved away slightly from the spec draft [1]. I was >>> not aware, because I haven't wrote parser by myself. The draft >>> says: >>> >>> SwitchLabeledRule: SwitchLabeledExpression SwitchLabeledBlock >>> SwitchLabeledThrowStatement >>> >>> SwitchLabeledExpression: SwitchLabel -> Expression ; >>> SwitchLabeledBlock: SwitchLabel -> Block ; >>> SwitchLabeledThrowStatement: SwitchLabel -> ThrowStatement ; >>> >>> (by the way I think that ; after block and throw should not be >>> present: current implementation does not require it after the >>> block and throw statement already includes a ; inside it). >>> >>> Instead we implement it like: >>> >>> SwitchLabeledRule: SwitchLabel -> SwitchLabeledRuleStatement >>> SwitchLabeledRuleStatement: ExpressionStatement Block >>> ThrowStatement >>> >>> So we assume that the right part of SwitchLabeledRule is always >>> a statement and reused ExpressionStatement to express Expression >>> plus semicolon, because syntactically it looks the same. Strictly >>> following a spec draft here looks even more ugly, because it >>> requires more object types in our code model and reduces the >>> flexibility when we need to perform code transformation. E.g. if >>> we want to wrap expression into block, currently we just need to >>> replace an ExpressionStatement with a Block not touching a >>> SwitchLabel at all. 
>>> Had we mirrored the spec in our code model, we would need to replace SwitchLabeledExpression with SwitchLabeledBlock which looks more annoying.
>>>
>>> With best regards, Tagir Valeev
>>>
>>> [1] http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.11
>>>
>

From forax at univ-mlv.fr Thu Jan 17 21:23:40 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 17 Jan 2019 22:23:40 +0100 (CET)
Subject: Sealed types -- updated proposal
In-Reply-To: <3E0B65B1-F965-4F9A-9266-3428CA1AE2D7@oracle.com>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <921157978.916977.1547721170441.JavaMail.zimbra@u-pem.fr> <3E0B65B1-F965-4F9A-9266-3428CA1AE2D7@oracle.com>
Message-ID: <1411732236.1075217.1547760220554.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 17 January 2019 19:20:05
> Subject: Re: Sealed types -- updated proposal

>> Given that the proposal introduces the notion of sealed types, "sealed" is a better keyword.
>
> Note that `sealed` already has a meaning in the context of packages (see Package.isSealed()), though it is minor.

Yes, and the introduction of modules made it more or less obsolete.

> Is there a different hyphenation of final other than semi-final that maintains the connection to finality, but doesn't weird people out?

final-hierarchy, final-type, final-bound, bounded-final, final-close, final-abstract, final-transitive, final-tree, final-subtypes, super-final

> (One thing I dislike about sealed / non-sealed is that now we have _contextual_ keywords with hyphens, which wasn't the discipline we were aiming for in the hyphenation proposal.)

Let's try with other keywords: close-class/open-class, abstract-close/abstract-open, extends-close/extends-open

Note that in our context, the keyword sealed/non-sealed has to be followed by either a modifier or class/interface/enum, so we can allow it inside the code even if it's not pretty.
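
A minimal sketch of how `sealed` and `non-sealed` read in code. It assumes the syntax that later shipped in Java 17, which differs from the proposal in this thread on one point: there, a permitted subtype must say `final`, `sealed`, or `non-sealed` explicitly (records are implicitly final) rather than being implicitly sealed. All type names are illustrative.

```java
public class SealedSketch {
    // A sealed root: only the listed subtypes may implement it directly.
    public sealed interface Expr permits Value, Add, Custom {}

    // Records are implicitly final, so each one closes its branch.
    public record Value(int value) implements Expr {}
    public record Add(Expr left, Expr right) implements Expr {}

    // `non-sealed` reopens this branch: any class may extend Custom.
    public static non-sealed class Custom implements Expr {}
    public static class UserDefined extends Custom {}

    public static int eval(Expr e) {
        if (e instanceof Value v) return v.value();
        if (e instanceof Add a) return eval(a.left()) + eval(a.right());
        return 0; // Custom and its open-ended subclasses
    }

    public static void main(String[] args) {
        // `sealed` is a contextual keyword, so it is still a legal variable name.
        int sealed = eval(new Add(new Value(1), new Value(2)));
        System.out.println(sealed);
        System.out.println(Expr.class.isSealed());
        System.out.println(Custom.class.isSealed());
    }
}
```

In each declaration the modifier is followed by another modifier or by `class`/`interface`/`record`, which is the position the message above relies on to keep the hyphenated form parseable.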
Rémi

From forax at univ-mlv.fr Sun Jan 20 12:49:07 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 20 Jan 2019 13:49:07 +0100 (CET)
Subject: nest syntax proposal
Message-ID: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr>

Hi all,
as Brian said recently, we have an issue because we are shortening the class declaration (with records) or want to declare a hierarchy of types in a single compilation unit (with sealed types), and currently Java requires that a compilation unit have only one public class.

One solution is to get rid of this constraint, because it may have been a good idea in 1995 but today we are writing programs that have far more classes (the recent introduction of modules was also driven by that idea). I propose another way of solving that issue: introducing a mechanism to opt in to having more than one public class in a compilation unit.

Currently we have the mechanism of nestmates, which has a runtime representation (VM + reflection) but no language representation. I propose to introduce a new declaration in the language, in between the package declaration and the first import,
  nest NestHostClass;
which defines the class that will be used as nest host (obviously it can be another keyword instead of "nest").
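
The runtime side mentioned above can be observed today through reflection, assuming JDK 11 or later, where `Class.getNestHost()` and `Class.isNestmateOf()` are available. The sketch below uses ordinary nested classes, because the `nest` declaration is only a proposal and is not valid Java; the class names are illustrative.

```java
import java.util.ArrayList;
import java.util.Objects;

public class NestSketch {
    public static class Fruit {
        private final String name;   // private, yet readable by nestmates
        public Fruit(String name) { this.name = name; }
    }

    public static class FruitBasket {
        private final ArrayList<Fruit> fruits = new ArrayList<>();

        public void add(Fruit fruit) {
            Objects.requireNonNull(fruit);
            fruits.add(fruit);
        }

        // Reads a private field of a sibling class; allowed because both
        // classes belong to the nest hosted by NestSketch.
        public String first() { return fruits.get(0).name; }
    }

    public static void main(String[] args) {
        // With nesting, the nest host is the outermost enclosing class.
        System.out.println(Fruit.class.getNestHost() == NestSketch.class);
        System.out.println(Fruit.class.isNestmateOf(FruitBasket.class));

        FruitBasket basket = new FruitBasket();
        basket.add(new Fruit("apple"));
        System.out.println(basket.first());
    }
}
```

With today's language the only way to obtain this shared private access is to nest the classes, which also changes their names to `NestSketch.Fruit` and `NestSketch.FruitBasket`; the proposal aims to get the first effect without the second.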
So a closed hierarchy can be defined like this in one compilation unit:

  nest Expr;

  public sealed Expr permits Variable, Value, Add;
  public record Variable(String name) implements Expr;
  public record Value(int value) implements Expr;
  public record Add(Expr left, Expr right) implements Expr;

at runtime, Variable.class.getNestHost() == Expr.class

Another simpler example:

  nest FruitBasket;

  public record Fruit(String name);

  public class FruitBasket {
    private final ArrayList<Fruit> fruits = new ArrayList<>();

    public void add(Fruit fruit) {
      Objects.requireNonNull(fruit);
      fruits.add(fruit);
    }
  }

at runtime, Fruit.class.getNestHost() == FruitBasket.class

I believe that the nest host class defined by the keyword "nest" doesn't have to be public, but it's not a qualified name (obviously) and the class has to be defined in the compilation unit.

Defining a nest can be seen as an extension of the case with only one class: if there is only one class in the compilation unit, the class is its own nest host.
If there is more than one class in the compilation unit, but only one class is public, currently they are not nestmates; I think we should not do anything to try to retcon that compilation unit because this case is rare (one may argue that if we introduce the nest syntax, it can become more frequent). Also the compiler message should be tweaked if there is more than one public class, to say that a nest can be defined.

Rémi

From brian.goetz at oracle.com Sun Jan 20 17:04:00 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 20 Jan 2019 12:04:00 -0500
Subject: nest syntax proposal
In-Reply-To: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr>
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr>
Message-ID:

This is a nice example of "today's problems come from yesterday's solutions." In Java 1.1, we did nested classes, and they were pretty cool, but there were some mismatches between the language model and the runtime model, which had some sharp edges.
So, as a solution to that problem, we taught the JVM about the notion of "nest", to align the two. The intent, at the time, was that the natural syntax for nests was ... nesting.

Now, you're saying that it kind of stinks that we have to take all the properties of nests (shared access control, hierarchical namespace) or none of them, and you'd like to introduce a way to get the first without the second. It's a fair idea.

However, I think I'd solve the problem (which is that it is irritating to have to say FruitBasket.Apple all the time, rather than Apple) more directly. Like some sort of more powerful "import". For example:

  import enum Foo;

could import the Foo enum class, _plus_ import-static all its constants. (And something similar for sealed classes, of course.)

> On Jan 20, 2019, at 7:49 AM, Remi Forax wrote:
>
> Hi all,
> as Brian said recently, we have an issue because we are shortening the class declaration (with records) or want to declare a hierarchy of types in a single compilation unit (with sealed types), and currently Java requires that a compilation unit have only one public class.
>
> One solution is to get rid of this constraint, because it may have been a good idea in 1995 but today we are writing programs that have far more classes (the recent introduction of modules was also driven by that idea). I propose another way of solving that issue: introducing a mechanism to opt in to having more than one public class in a compilation unit.
>
> Currently we have the mechanism of nestmates, which has a runtime representation (VM + reflection) but no language representation. I propose to introduce a new declaration in the language, in between the package declaration and the first import,
>   nest NestHostClass;
> which defines the class that will be used as nest host (obviously it can be another keyword instead of "nest").
> > So a closed hierarchy can defines like this in one compilation unit: > nest Expr; > > public sealed Expr permits Variable, Value, Add; > public record Variable(String name) implements Expr; > public record Value(int value) implements Expr; > public record Add(Expr left, Expr right) implements Expr; > > at runtime, Variable.class.getNestHost() == Expr.class > > Another simpler example > nest FruitBasket; > > public record Fruit(String name); > > public class FruitBasket { > private final ArrayList fruits = new ArrayList<>(); > > public void add(Fruit fruit) { > Objects.requireNonNull(fruit); > fruits.add(fruit); > } > } > > at runtime, Fruit.class.getNestHost() == FruitBasket.class > > I believe that the nest host class defined by the keyword "nest", doesn't have to be public, but it's not a qualified name (obviously) and the class has to be defined in the compilation unit. > > Defining a nest can be seen as an extension of the case with only one class, if there is only one class in the compilation unit, the class is it's own nest host. > If there is more than one class in the compilation unit, but only one class is public, currently, they are not nestmates, i think we should not do anything to try to retcon that compilation unit because this case is rare (one may argument that if we introduce the nest syntax, it can be more frequent). Also the compiler message should be tweaked if there are more than one public classes to say that a nest can be defined. 
>
> Rémi

From forax at univ-mlv.fr Sun Jan 20 21:31:36 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 20 Jan 2019 22:31:36 +0100 (CET)
Subject: nest syntax proposal
In-Reply-To:
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr>
Message-ID: <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Sunday, 20 January 2019 18:04:00
> Subject: Re: nest syntax proposal

> This is a nice example of "today's problems come from yesterday's solutions." In Java 1.1, we did nested classes, and they were pretty cool, but there were some mismatches between the language model and the runtime model, which had some sharp edges. So, as a solution to that problem, we taught the JVM about the notion of "nest", to align the two. The intent, at the time, was that the natural syntax for nests was ... nesting.
>
> Now, you're saying that it kind of stinks that we have to take all the properties of nests (shared access control, hierarchical namespace) or none of them, and you'd like to introduce a way to get the first without the second. It's a fair idea.

Yes, I see the fact that in Java the language forces classes to be enclosed to have private access as an accidental complexity. The JVM has no such requirement, so I propose to reconcile the language and the VM nestmates.

> However, I think I'd solve the problem (which is that it is irritating to have to say FruitBasket.Apple all the time, rather than Apple) more directly. Like some sort of more powerful "import". For example:
>
>   import enum Foo;
>
> could import the Foo enum class, _plus_ import-static all its constants. (And something similar for sealed classes, of course.)

In a sense, you are doubling down on the notion of hierarchy, or at least of enclosing, by saying that you have an "import tree" that can import a set of types that are declared inside another one.
The main issue with your solution is that you cannot retrofit an existing hierarchy/set of classes to this new scheme, because moving a class inside another changes its name, so it's not a backward compatible change; hence the idea to decouple nestmates access and nested classes.

Rémi

>
>> On Jan 20, 2019, at 7:49 AM, Remi Forax wrote:
>>
>> Hi all,
>> as Brian said recently, we have an issue because we are shortening the class declaration (with records) or want to declare a hierarchy of types in a single compilation unit (with sealed types), and currently Java requires that a compilation unit have only one public class.
>>
>> One solution is to get rid of this constraint, because it may have been a good idea in 1995 but today we are writing programs that have far more classes (the recent introduction of modules was also driven by that idea). I propose another way of solving that issue: introducing a mechanism to opt in to having more than one public class in a compilation unit.
>>
>> Currently we have the mechanism of nestmates, which has a runtime representation (VM + reflection) but no language representation. I propose to introduce a new declaration in the language, in between the package declaration and the first import,
>>   nest NestHostClass;
>> which defines the class that will be used as nest host (obviously it can be another keyword instead of "nest").
>>
>> So a closed hierarchy can be defined like this in one compilation unit:
>>   nest Expr;
>>
>>   public sealed interface Expr permits Variable, Value, Add;
>>   public record Variable(String name) implements Expr;
>>   public record Value(int value) implements Expr;
>>   public record Add(Expr left, Expr right) implements Expr;
>>
>> at runtime, Variable.class.getNestHost() == Expr.class
>>
>> Another, simpler example:
>>   nest FruitBasket;
>>
>>   public record Fruit(String name);
>>
>>   public class FruitBasket {
>>     private final ArrayList<Fruit> fruits = new ArrayList<>();
>>
>>     public void add(Fruit fruit) {
>>       Objects.requireNonNull(fruit);
>>       fruits.add(fruit);
>>     }
>>   }
>>
>> at runtime, Fruit.class.getNestHost() == FruitBasket.class
>>
>> I believe that the nest host class defined by the keyword "nest" doesn't have
>> to be public, but it's not a qualified name (obviously) and the class has to be
>> defined in the compilation unit.
>>
>> Defining a nest can be seen as an extension of the case with only one class: if
>> there is only one class in the compilation unit, the class is its own nest
>> host.
>> If there is more than one class in the compilation unit, but only one class is
>> public, currently they are not nestmates. I think we should not do anything to
>> try to retcon that compilation unit, because this case is rare (one may argue
>> that if we introduce the nest syntax, it can become more frequent). Also, the
>> compiler message should be tweaked if there is more than one public class, to
>> say that a nest can be defined.
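
For reference, the runtime side of nestmates that the proposal builds on can be observed through reflection today. The following is a minimal sketch, assuming Java 11 or later (where `Class.getNestHost()` and `Class.isNestmateOf()` are available); the class names `NestDemo`, `Inner` and `Auxiliary` are made up for illustration and are not from the thread. It uses ordinary nesting, not the proposed `nest` declaration, which does not exist in the language.

```java
public class NestDemo {
    // A nested class shares a nest with its enclosing top-level class.
    static class Inner {}

    // Formats a class together with the simple name of its nest host.
    static String describe(Class<?> c) {
        return c.getSimpleName() + " -> host " + c.getNestHost().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(describe(Inner.class));      // Inner -> host NestDemo
        System.out.println(describe(Auxiliary.class));  // Auxiliary -> host Auxiliary
        // Nested class and enclosing class are nestmates.
        System.out.println(Inner.class.isNestmateOf(NestDemo.class));      // true
        // An auxiliary top-level class in the same file is its own nest.
        System.out.println(Auxiliary.class.isNestmateOf(NestDemo.class));  // false
    }
}

// An auxiliary (non-public, top-level) class in the same compilation unit.
class Auxiliary {}
```

The last line of output illustrates the status quo described in the proposal: classes that merely share a compilation unit are not nestmates unless one is nested in the other.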
From brian.goetz at oracle.com Sun Jan 20 21:43:06 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 20 Jan 2019 16:43:06 -0500
Subject: nest syntax proposal
In-Reply-To: <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr>
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr> <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr>
Message-ID: <5E82EF86-7F77-4114-A5A2-D38AF932ED55@oracle.com>

So, there are about 100 people in the world who know what "nest mates" means, which is a huge black mark against introducing an explicit "nest" concept into the language. Nest mates have been a low-level implementation detail, and the fact that this detail is hidden from the user model is a feature, not a bug. This is a big new concept to teach to a large audience that doesn't yet have any conception of it. So far, the benefits do not remotely outweigh the degree to which it exposes new complexity to everyone.

I would much prefer a solution that builds on existing concepts that people already understand to one that requires them to learn a new concept just to support a particular use of a new feature.

People understand import. And they understand auxiliary classes. Let's work with the familiar concepts.

From forax at univ-mlv.fr Sun Jan 20 22:51:23 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 20 Jan 2019 23:51:23 +0100 (CET)
Subject: nest syntax proposal
In-Reply-To: <5E82EF86-7F77-4114-A5A2-D38AF932ED55@oracle.com>
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr> <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr> <5E82EF86-7F77-4114-A5A2-D38AF932ED55@oracle.com>
Message-ID: <1941659492.8313.1548024683182.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Sunday, January 20, 2019 22:43:06
> Subject: Re: nest syntax proposal

> So, there are about 100 people in the world who know what "nest mates" means,
> which is a huge black mark against introducing an explicit "nest" concept into
> the language. Nest mates have been a low-level implementation detail, and the
> fact that this detail is hidden from the user model is a feature, not a bug.
> This is a big new concept to teach to a large audience that doesn't yet have
> any conception of it. So far, the benefits do not remotely outweigh the degree
> to which it exposes new complexity to everyone.

> I would much prefer a solution that builds on existing concepts that people
> already understand to one that requires them to learn a new concept just to
> support a particular use of a new feature.

Yes, introducing a new concept is the weak part of my proposal.
I would prefer not to introduce the concept of nest in the language and just say that all classes inside the same compilation unit have nestmate access (almost an extension of the current meaning). There are two issues with that:
- it's a source-compatible change, but a dangerous one, because it means that if an existing .java file that declares a public class and a non-public class (something which is currently allowed) is recompiled, the two classes will gain nestmate access.
- nestmates in the VM are based on the notion of nest host; how does the compiler determine which class is the nest host when there are several public classes in the compilation unit?

Maybe someone has a better idea than the keyword 'nest'?

> People understand import.

Compiler devs and JLS maintainers don't :)
import/import static is still a regular source of bugs; adding more overloaded meanings to import is something dangerous, IMO.

> And they understand auxiliary classes. Let's work with the familiar concepts.

But your solution seems tailored to sealed types; what if I have several record classes with no common abstract type, how does it work?
And again, there is no refactoring between a classical interface and a sealed interface, something we should try to offer.

Rémi

From brian.goetz at oracle.com Sun Jan 20 23:07:01 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 20 Jan 2019 18:07:01 -0500
Subject: nest syntax proposal
In-Reply-To: <1941659492.8313.1548024683182.JavaMail.zimbra@u-pem.fr>
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr> <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr> <5E82EF86-7F77-4114-A5A2-D38AF932ED55@oracle.com> <1941659492.8313.1548024683182.JavaMail.zimbra@u-pem.fr>
Message-ID:

> And they understand auxiliary classes. Let's work with the familiar concepts.
>
> but your solution seems tailored to sealed types, what if i have several
> record classes with no common abstract type, how it works ?
> and again there is no refactoring between a classical interface and a sealed
> interface, something we should try offer.

Indeed, I think there are two things here. We can choose to address one or both or neither, with the obvious tradeoffs.

Issue #1 is that sealed types naturally form a family; you're going to switch over them, etc., like enums, and we want the same (or more) nice treatment as we currently get with enums in switch. So it makes sense to define them all in one place; indeed, that should be the common case. Doing something with import helps here, but as you say, it is more specific to sealed types. On the other hand, enums and sealed types are related, so mirroring the special treatment that enums get (maybe even giving both a little more) is a low-energy-state solution.

Issue #2 is that with records, one line per file starts to seem silly. This has a lot of overlap with #1, but not entirely. You could argue it's a more general solution, and maybe we like that, but maybe we don't. It surely is more intrusive: it affects the existing semantics of auxiliary classes, dramatically increases the use of auxiliary classes (which, BTW, we've banned from the JDK source base because they make life very hard for build tooling), etc. It's a bigger hammer.
It's also possible we do nothing here, and let users nest the subtypes and clients just say "import static SealedType.*". That's the least intrusive, so we should compare cost/benefit against that baseline.

From forax at univ-mlv.fr Sat Jan 26 13:24:13 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 26 Jan 2019 14:24:13 +0100 (CET)
Subject: nest syntax proposal
In-Reply-To:
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr> <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr> <5E82EF86-7F77-4114-A5A2-D38AF932ED55@oracle.com> <1941659492.8313.1548024683182.JavaMail.zimbra@u-pem.fr>
Message-ID: <592592550.1353369.1548509053944.JavaMail.zimbra@u-pem.fr>

I've slept on that problem.

Let's focus first on the sealed type. You are proposing to declare the subtypes inside the sealed type, so you get nesting automatically and you can use import plus special switch rules to be able to reference a subtype directly.
I believe this falls flat with a sealed class, because you are mixing inheritance and enclosing scopes.

Let's take an example:
  sealed class Expr {
    public void aMethod() { ... }

    record Value(int value) extends Expr;
    record Add(Expr left, Expr right) extends Expr {
      public void anotherMethod() {
        aMethod();  // calls super.aMethod() or Expr.this.aMethod()?
      }
    }
  }

As you can see, we are mixing inheritance and enclosing scope here; I don't think it's wise to let a user write this kind of monstrosity.

I wish inner classes had never been invented (anonymous classes are fine), but it's too late ...

We can say that records are always static classes, but refactoring a class to a record or vice versa will create an instant puzzler.
We could support only sealed interfaces, not sealed classes; that may not be a bad idea in itself. Anyway, I think we should not try to mix nesting and subtyping.

Rémi
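
The lookup question raised by the `aMethod()` example in the message above can be tried out in Java as it exists today, using an ordinary inner class that extends its enclosing class. This is only a sketch: records are replaced by plain classes, and the names `ScopeDemo`, `unqualified` and `enclosing` are hypothetical. As far as the method-lookup rules for simple names go, the unqualified call should pick the inherited member on `this`, while `Expr.this` reaches the enclosing instance.

```java
public class ScopeDemo {
    static class Expr {
        private final String label;

        Expr(String label) { this.label = label; }

        String aMethod() { return label; }

        // Inner (non-static) class that also extends its enclosing class,
        // so it has both an inherited aMethod() and an enclosing instance.
        class Add extends Expr {
            Add(String label) { super(label); }

            // Unqualified call: resolved against the inherited member of Add.
            String unqualified() { return aMethod(); }

            // Explicitly targets the enclosing Expr instance.
            String enclosing() { return Expr.this.aMethod(); }
        }
    }

    public static void main(String[] args) {
        Expr outer = new Expr("outer");
        Expr.Add add = outer.new Add("inner");
        System.out.println(add.unqualified()); // prints "inner"
        System.out.println(add.enclosing());   // prints "outer"
    }
}
```

Both calls compile and silently differ in their receiver, which is the kind of puzzler the message warns about.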
From brian.goetz at oracle.com Sat Jan 26 16:08:51 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 26 Jan 2019 11:08:51 -0500
Subject: nest syntax proposal
In-Reply-To: <592592550.1353369.1548509053944.JavaMail.zimbra@u-pem.fr>
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr> <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr> <5E82EF86-7F77-4114-A5A2-D38AF932ED55@oracle.com> <1941659492.8313.1548024683182.JavaMail.zimbra@u-pem.fr> <592592550.1353369.1548509053944.JavaMail.zimbra@u-pem.fr>
Message-ID:

Let's back up and look at the problem, before we jump to solutions.

The sealed class mechanism, on its own, is fine. I think we have the right knobs to express the sorts of types we want.

People are going to want to express sums of records. And they are going to want to do so "conveniently." There are several inconveniences:

 - Defining a sum of five records in a flat namespace requires six files, all of which may frequently be one-liners. This is annoying to write, but it is also harder to read: if these classes are so tightly related, we want to declare them in one place.

 - The streamlined syntax (inferring the permits clause) is clearly aimed at supporting the above (many languages with sealed types don't even let you declare subtypes outside of the same compilation unit). It is better for readers and writers for simple subtypes of simple sealed types to be defined together.

 - We could define them as auxiliary classes, and everything would be great, except aux classes can't be public. (Accidental interaction number 1.)

 - We could define them as nested classes, and everything would be great, except then clients have to deal with the nesting. (Accidental interaction number 2.)

If we don't solve this problem at all, what are people most likely going to do? Most of the time, I suspect non-API writers will write nested records, and then use import static to hide the nesting from the client view. And API writers will bite the bullet and use six files.

Neither of these is terrible, but they have that whiff of friction that we know people will complain about, because they don't like their choices. But I think we could ship the feature with no special support in any case. So we're in the "filing rough edges" department.
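
The "nest the subtypes" baseline discussed above can be sketched with the syntax that later shipped (records and pattern matching for instanceof in Java 16, sealed types in Java 17), which differs from the one-line declarations used in this thread. This is an illustration under those assumptions, not the design being debated; `SumDemo` and `eval` are made-up names.

```java
public class SumDemo {
    // The permits clause is omitted: it is inferred from the subtypes
    // declared in the same compilation unit.
    sealed interface Expr {
        record Value(int value) implements Expr {}
        record Add(Expr left, Expr right) implements Expr {}
    }

    // Clients outside Expr have to spell out the nesting (Expr.Value,
    // Expr.Add) unless they import the nested names.
    static int eval(Expr e) {
        if (e instanceof Expr.Value v) {
            return v.value();
        }
        if (e instanceof Expr.Add a) {
            return eval(a.left()) + eval(a.right());
        }
        throw new AssertionError("unreachable: Expr is sealed");
    }

    public static void main(String[] args) {
        Expr e = new Expr.Add(new Expr.Value(1),
                              new Expr.Add(new Expr.Value(2), new Expr.Value(3)));
        System.out.println(eval(e)); // prints 6
    }
}
```

In a real package, a client could hide the `Expr.` prefix with an import of the nested types (for example the `import static SealedType.*` form mentioned earlier in the thread), which is the friction being weighed here.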
From forax at univ-mlv.fr Sat Jan 26 16:54:03 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 26 Jan 2019 17:54:03 +0100 (CET)
Subject: nest syntax proposal
In-Reply-To:
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr> <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr> <5E82EF86-7F77-4114-A5A2-D38AF932ED55@oracle.com> <1941659492.8313.1548024683182.JavaMail.zimbra@u-pem.fr> <592592550.1353369.1548509053944.JavaMail.zimbra@u-pem.fr>
Message-ID: <19954913.1361832.1548521643711.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Saturday, January 26, 2019 17:08:51
> Subject: Re: nest syntax proposal

> Let's back up and look at the problem, before we jump to solutions.

> The sealed class mechanism, on its own, is fine. I think we have the right knobs
> to express the sorts of types we want.

> People are going to want to express sums of records. And they are going to want
> to do so "conveniently." There are several inconveniences:

> - Defining a sum of five records in a flat namespace requires six files, all of
> which may frequently be one-liners. This is annoying to write, but it is also
> harder to read: if these classes are so tightly related, we want to declare
> them in one place.

> - The streamlined syntax (inferring the permits clause) is clearly aimed at
> supporting the above (many languages with sealed types don't even let you
> declare subtypes outside of the same compilation unit). It is better for
> readers and writers for simple subtypes of simple sealed types to be defined
> together.

> - We could define them as auxiliary classes, and everything would be great,
> except aux classes can't be public. (Accidental interaction number 1.)

> - We could define them as nested classes, and everything would be great, except
> then clients have to deal with the nesting. (Accidental interaction number 2.)

We can allow multiple public classes when there is a sealed type by lifting the restriction, requiring the compilation unit to have the same name as the interface, and making all subtypes nestmates of the interface.
For tools, it means that if you want to find the Java file for a class file and there is no corresponding Java file, you can look in the Java file of the nest host of the class file.

> If we don't solve this problem at all, what are people most likely going to do?
> Most of the time, I suspect non-API writers will write nested records, and then
> use import static to hide the nesting from the client view.

which is fine with a sealed interface, not a sealed class ...

> And API writers will bite the bullet and use six files.

> Neither of these is terrible, but they have that whiff of friction that we know
> people will complain about, because they don't like their choices. But I think
> we could ship the feature with no special support in any case. So we're in the
> "filing rough edges" department.

I think we can lift the "only one public class" restriction if the compilation unit is the sealed type, as described above, because it's a backward-compatible change.
It doesn't solve the case where you want several record classes with no sealed type, but at least we cover the most common case.

Rémi

From forax at univ-mlv.fr Sat Jan 26 17:02:47 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 26 Jan 2019 18:02:47 +0100 (CET)
Subject: nest syntax proposal
In-Reply-To:
References: <1333078085.412872.1547988547419.JavaMail.zimbra@u-pem.fr> <144615202.2355.1548019896571.JavaMail.zimbra@u-pem.fr> <5E82EF86-7F77-4114-A5A2-D38AF932ED55@oracle.com> <1941659492.8313.1548024683182.JavaMail.zimbra@u-pem.fr> <592592550.1353369.1548509053944.JavaMail.zimbra@u-pem.fr>
Message-ID: <712797525.1362310.1548522167297.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Saturday, January 26, 2019 17:08:51
> Subject: Re: nest syntax proposal

> Let's back up and look at the problem, before we jump to solutions.

> The sealed class mechanism, on its own, is fine. I think we have the right knobs
> to express the sorts of types we want.
> People are going to want to express sums of records. And they are going to want
> to do so "conveniently." There are several inconveniences:
> - Defining a sum of five records in a flat namespace requires six files, all of
> which may frequently be one-liners. This is annoying to write, but it is also
> harder to read; if these classes are so tightly related, we want to declare
> them in one place.
> - The streamlined syntax (inferring the permits clause) is clearly aimed at
> supporting the above (many languages with sealed types don't even let you
> declare subtypes outside of the same compilation unit.) It is better for
> readers and writers for simple subtypes of simple sealed types to be defined
> together.
> - We could define them as auxiliary classes, and everything would be great,
> except aux classes can't be public. (Accidental interaction number 1.)
> - We could define them as nested classes, and everything would be great, except
> then clients have to deal with the nesting. (Accidental interaction number 2.)

The other solution, again only for the sealed type case, is to force nesting (so you don't need permits) but flatten the subtypes when generating the classfiles. In that case, I think we should use another keyword than interface to introduce the sealed type (as you already proposed earlier).

sum Expr {
  record Value(int value);
  record Add(Expr left, Expr right);
}

as you said, we want to define a sum type, and by not using the keyword 'interface' we have no backward compatibility issue, so we can have nesting when declaring in Java but flattening of subtypes at the use site.

[...]

Rémi

From brian.goetz at oracle.com Thu Jan 31 19:12:50 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 31 Jan 2019 14:12:50 -0500
Subject: Sealed types -- updated proposal
In-Reply-To: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com>
Message-ID: <50BEA2AF-6F3E-4B35-BD25-3EAF7E18966F@oracle.com>

Since this seems to be the open issue that has the most divergence...

The basic problem is that the user is "forced"
to choose between a convenient and readable way to declare the classes (a group of related, short declarations in one place), and exporting a flat namespace. And the downside of a non-flat namespace is that then the client has to say Node.AddNode instead of just AddNode, which is inconvenient. So we can have what's nice for the declaration site, or what's nice for the use site, but not both. Here are the options that have been proposed so far:

Option 1: Do nothing; just tell clients to `import static Node.*`.
Option 2: Automagically import-static the nested subtypes of Node when you import sealed type Node.
Option 3: Do what we do for enums: when switching on an enum of type C, allow `case` labels to omit the `C.` prefix.
Option 4: Relax the rules about public auxiliary types.

#1 is a viable solution, and has the benefit that it requires no incremental complexity. IDEs will help here too. So doing nothing is an acceptable choice; this should be our null hypothesis.

#2 is cute, but (a) import processing is already nastier than it looks, and (b) this feels odoriferously specific to the interaction between two features.

#3 seems pretty justifiable to me; enums and sealed types are sibling constructs (both are about controlling the number of things that can be a member of the value set), so having a similar rule for both is arguably reducing gratuitous asymmetries. It is also a smaller change than #2 or #4, and likely will cover a great deal of the pain-causing situations.

People who maintain large codebases (JDK, Google) are pretty down on #4; we ban auxiliary classes in the JDK in part because it makes tooling support so much harder. Google does something similar. This one strikes me as something that seems enticing at first but ultimately would cause other problems. It also shares downside (b) from #2.

Given that, I'm going to cross #2 and #4 off the list for consideration, and restrict the choices to #1 and #3 (open to new ideas that have not yet been discussed.) I have a preference for #3, but could be moved off it.

> On Jan 9, 2019, at 1:44 PM, Brian Goetz wrote:
>
> Auxiliary subtypes. With the advent of records, which allow us to define classes in a single line, the "one class per file" rule starts to seem both a little silly, and to constrain the user's ability to put related definitions together (which may be more readable) while exporting a flat namespace in the public API.
>
> One way to get there would be to relax the "no public auxiliary classes" rule for sealed classes, say, by allowing public auxiliary subtypes of the primary type, if the primary type is public and sealed.
>
> Another would be to borrow a trick from enums; for a sealed type with nested subtypes, when you import the sealed type, you implicitly import the nested subtypes too. That way you could declare:
>
> semi-final interface Node {
>   class A implements Node { }
>   class B implements Node { }
> }
>
> ...but clients could import Node and then refer to A and B directly:
>
> switch (node) {
>   case A(): ...
>   case B(): ...
> }
>
> We do something similar for enum constants today.

From forax at univ-mlv.fr Thu Jan 31 21:07:12 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 31 Jan 2019 22:07:12 +0100 (CET)
Subject: Sealed types -- updated proposal
In-Reply-To: <50BEA2AF-6F3E-4B35-BD25-3EAF7E18966F@oracle.com>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <50BEA2AF-6F3E-4B35-BD25-3EAF7E18966F@oracle.com>
Message-ID: <171843747.2012.1548968832463.JavaMail.zimbra@u-pem.fr>

You have forgotten that
- if you have a sealed class (not a sealed interface), using nesting has the side effect of creating inner classes. I don't know what Google's policy is on inner classes that mix enclosing class access and inheritance, but I suppose they are prohibited too.
- for #4, I've proposed a simple scheme that allow tools to find the compilation unit of any auxiliary classes of a sealed type.

Rémi
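
An illustrative sketch of the trade-off debated in this thread, written against the sealed-interface and record syntax that later shipped in Java 17 rather than the provisional syntax used in these messages. The names SealedNestingDemo and eval are hypothetical. Nesting the records inside the sealed interface keeps the declarations in one place and lets the permits clause be inferred, but client code then has to write Expr.Value and Expr.Add unless it adds a static import, which is the friction the options above try to reduce.

```java
// Sketch only: uses Java 17 syntax, not the draft syntax from the thread.
class SealedNestingDemo {

    // Subtypes are nested in the sealed type, so no permits clause is needed:
    // the permitted subtypes are inferred from the same compilation unit.
    sealed interface Expr {
        record Value(int value) implements Expr {}
        record Add(Expr left, Expr right) implements Expr {}
    }

    // Client code outside Expr must qualify the nested names (Expr.Value,
    // Expr.Add) unless it uses a static import of the nested types.
    static int eval(Expr e) {
        if (e instanceof Expr.Value v) {
            return v.value();
        }
        if (e instanceof Expr.Add a) {
            return eval(a.left()) + eval(a.right());
        }
        // Expr is sealed, so no other subtype can exist.
        throw new AssertionError("unexpected Expr subtype: " + e);
    }

    public static void main(String[] args) {
        Expr e = new Expr.Add(new Expr.Value(1), new Expr.Value(2));
        System.out.println(eval(e)); // prints 3
    }
}
```

Using a sealed interface with records, rather than a sealed class, sidesteps the inner-class ambiguity raised earlier in the thread, because types nested in an interface are implicitly static.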