From brian.goetz at oracle.com Fri Feb 1 14:57:11 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 1 Feb 2019 09:57:11 -0500
Subject: Sealed types -- updated proposal
In-Reply-To: <171843747.2012.1548968832463.JavaMail.zimbra@u-pem.fr>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <50BEA2AF-6F3E-4B35-BD25-3EAF7E18966F@oracle.com> <171843747.2012.1548968832463.JavaMail.zimbra@u-pem.fr>
Message-ID: <48D76477-C983-4E63-8BCF-5A6323819C21@oracle.com>

> On Jan 31, 2019, at 4:07 PM, Remi Forax wrote:
>
> You have forgotten that
> - if you have a sealed class (not sealed interface), using nesting has the side effect of creating inner classes.

Kind of a strange way to put it. I would put it as: "the user has the option of nesting both static and non-static classes, as is appropriate to the situation."

And, nested records -- the likely common case -- will be implicitly static.

> - for #4, I've proposed a simple scheme that allows tools to find the compilation unit of any auxiliary classes of a sealed type.

Everything is possible. But, it's a question of cost vs benefit. I have come around to thinking this is a bigger hammer than the value of the benefit. And further, a rule like "it would only be allowed for subtypes of a main sealed type" is a pretty serious design smell. If we're going to do this, it should be all or nothing, standing on its own, but there is limited appetite for this.

Aligning the treatment with enums -- which is the other source of exhaustiveness constraints in the language -- is a much cleaner move.

For libraries like the JDK, we'll almost surely bite the bullet and split into separate source files. This is an acceptable "tax" for the JDK; we pay taxes like this all the time. There's a range of other tradeoffs users can make.

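The static versus non-static nesting point can be sketched in a released JDK. This is only an illustration, written against the `sealed`/`permits` syntax that eventually shipped in Java 17 rather than the 2019 proposal under discussion; `NestingSketch`, `Node`, `Leaf`, `Inner`, and `Shape` are hypothetical names chosen for the example.

```java
import java.lang.reflect.Modifier;

// Hypothetical driver class; reports whether a nested type is static.
final class NestingSketch {
    static boolean isStaticNested(Class<?> type) {
        return Modifier.isStatic(type.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println("Leaf static?   " + isStaticNested(Node.Leaf.class));
        System.out.println("Inner static?  " + isStaticNested(Node.Inner.class));
        System.out.println("Circle static? " + isStaticNested(Shape.Circle.class));
    }
}

// Hypothetical sealed class whose subclasses are nested inside it.
abstract sealed class Node permits Node.Leaf, Node.Inner {
    // Declared static: no hidden reference to an enclosing Node.
    static final class Leaf extends Node {}

    // 'static' omitted: this is an inner class, so each instance
    // also needs an enclosing Node instance.
    final class Inner extends Node {}
}

// Hypothetical sealed interface; nested records are implicitly static.
sealed interface Shape permits Shape.Circle, Shape.Square {
    record Circle(double radius) implements Shape {}
    record Square(double side) implements Shape {}
}
```

The `Inner` case is the one Remi is worried about: it compiles without complaint, yet it both extends `Node` and captures an enclosing `Node`.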
From forax at univ-mlv.fr Fri Feb 1 16:57:55 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 1 Feb 2019 17:57:55 +0100 (CET)
Subject: Sealed types -- updated proposal
In-Reply-To: <48D76477-C983-4E63-8BCF-5A6323819C21@oracle.com>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <50BEA2AF-6F3E-4B35-BD25-3EAF7E18966F@oracle.com> <171843747.2012.1548968832463.JavaMail.zimbra@u-pem.fr> <48D76477-C983-4E63-8BCF-5A6323819C21@oracle.com>
Message-ID: <2007017780.369462.1549040275613.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, February 1, 2019 15:57:11
> Subject: Re: Sealed types -- updated proposal

>> On Jan 31, 2019, at 4:07 PM, Remi Forax <forax at univ-mlv.fr> wrote:

>> You have forgotten that
>> - if you have a sealed class (not sealed interface), using nesting has the side
>> effect of creating inner classes.

> Kind of a strange way to put it. I would put it as: "the user has the option of
> nesting both static and non-static classes, as is appropriate to the
> situation."

Forgetting 'static' is a very, very common mistake; that's why I'm sensitive about that issue. And the resulting class is nonsense: it's a class that inherits from a superclass and delegates to the same superclass.

> And, nested records -- the likely common case -- will be implicitly static.

Yes, you're right. As a user, you still get a nasty surprise when you refactor a record to a class (to add a field, for example).

I think there is common ground here: we can disallow sealed classes. It will keep users from falling into the inner class trap, and we can allow them later if we were overly cautious.

>> - for #4, I've proposed a simple scheme that allows tools to find the compilation
>> unit of any auxiliary classes of a sealed type.

> Everything is possible. But, it's a question of cost vs benefit. I have come
> around to thinking this is a bigger hammer than the value of the benefit. And
> further, a rule like "it would only be allowed for subtypes of a main sealed
> type" is a pretty serious design smell. If we're going to do this, it should be
> all or nothing, standing on its own, but there is limited appetite for this.

Yes, I agree, it's not perfect.

> Aligning the treatment with enums -- which is the other source of exhaustiveness
> constraints in the language -- is a much cleaner move.

It's another clue that we should not allow sealed classes: enums, like interfaces, define a static context, unlike classes.

> For libraries like the JDK, we'll almost surely bite the bullet and split into
> separate source files. This is an acceptable "tax" for the JDK; we pay taxes
> like this all the time. There's a range of other tradeoffs users can make.

Rémi

From kevinb at google.com Fri Feb 1 16:59:44 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Fri, 1 Feb 2019 11:59:44 -0500
Subject: Sealed types -- updated proposal
In-Reply-To: <48D76477-C983-4E63-8BCF-5A6323819C21@oracle.com>
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <50BEA2AF-6F3E-4B35-BD25-3EAF7E18966F@oracle.com> <171843747.2012.1548968832463.JavaMail.zimbra@u-pem.fr> <48D76477-C983-4E63-8BCF-5A6323819C21@oracle.com>
Message-ID:

Question about option #3: would it make only direct subtypes of the sealed/switched-on type available in this way, or recursively?

I agree with limiting the options to #1 and #3. Between the two of them, I'm a bit on the fence. This is a good feature for enums, and this case is mostly similar, but not entirely; the chance for ambiguous simple names changes a few things. For example, when I add a new subtype to a sealed type that shares its simple name with an existing subtype, I've broken calling code that wasn't qualifying. Also, just the need for users to choose whether to qualify or not is a pain.

However, it is nice that it doesn't "make you" import the type, which makes that simple name usable in the entire file, including in places where its meaning would be less clear.

Meh?

-- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

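To make the comparison with enums concrete, here is a small sketch in released Java (17 or later). It shows only behaviour that exists today: enum constants named without qualification in case labels, and a nested subtype that has to be qualified unless it is imported. The unqualified-pattern shorthand discussed as option #3 is not shown, since in this thread it is only a proposal. `CaseNames`, `Suit`, and `Token` are hypothetical names.

```java
// Hypothetical example types; nothing here comes from the JDK or the proposal.
final class CaseNames {
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

    sealed interface Token permits Token.Number, Token.Word {
        record Number(int value) implements Token {}
        record Word(String text) implements Token {}
    }

    // Enum constants appear unqualified in case labels, and the switch
    // expression is exhaustive without a default branch.
    static String color(Suit suit) {
        return switch (suit) {
            case CLUBS, SPADES -> "black";
            case DIAMONDS, HEARTS -> "red";
        };
    }

    // Nested subtypes are reached through Token here. An unqualified
    // 'Number' at this point would mean java.lang.Number, which is the
    // kind of simple-name clash raised in the message above.
    static String describe(Token token) {
        if (token instanceof Token.Number n) {
            return "number " + n.value();
        }
        if (token instanceof Token.Word w) {
            return "word " + w.text();
        }
        throw new IllegalStateException("no other subtypes are permitted");
    }
}
```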
From brian.goetz at oracle.com Fri Feb 1 17:11:39 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 1 Feb 2019 12:11:39 -0500
Subject: Sealed types -- updated proposal
In-Reply-To:
References: <1b50c161-5860-db14-41cc-9b1777257d6f@oracle.com> <50BEA2AF-6F3E-4B35-BD25-3EAF7E18966F@oracle.com> <171843747.2012.1548968832463.JavaMail.zimbra@u-pem.fr> <48D76477-C983-4E63-8BCF-5A6323819C21@oracle.com>
Message-ID: <33780593-7BD4-42B3-B552-F9A58932F0E7@oracle.com>

> Question about option #3: would it make only direct subtypes of the sealed/switched-on type available in this way, or recursively?

There are multiple ways to cut this, but the one that seems most aligned with enums is: do this for subtypes (direct and indirect) _that are nested directly in the sealed type itself_. More precisely: for a switch on x : T, where T is sealed, allow unqualified type/dtor patterns on cases for T.V where V <: T.

From kevinb at google.com Fri Feb 1 19:25:27 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Fri, 1 Feb 2019 14:25:27 -0500
Subject: break-with
In-Reply-To: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com>
References: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com>
Message-ID:

I like `break-with value();` much better than `break value();` - and would also agree that that change alone shouldn't require re-previewing the feature.

On Thu, Jan 17, 2019 at 11:28 AM Brian Goetz wrote:

> Being able to call this something like `break-with v` (or some other
> derived keyword) would have made this all a lot simpler. (BTW, we can still
> do this, since expression-switch is still in preview.)
>
> It seems we're all in favor of break-with over unadorned "break"?
>
> Which feeds into the bigger question about promoting expression switch to
> final in 13. I don't think this syntactic change on its own merits
> re-previewing the feature; this is exactly the sort of "feature is
> finished, but we might change the paint color based on feedback" kind of
> thing that the preview mechanism was intended for.
>
> We don't have to make this decision quite yet, but sometime between now
> and feature-freeze for 13 (June) we have to take one of the following
> actions:
>
> - File a JEP to make it a permanent feature, possibly with changes
> - File a JEP to re-preview it, possibly with changes
> - Withdraw the feature
>
> We can continue to gather feedback on the feature and revisit later.

-- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Fri Feb 1 19:41:32 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 1 Feb 2019 14:41:32 -0500
Subject: break-with
In-Reply-To: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com>
References: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com>
Message-ID: <4AE6810F-8392-4B3E-AEC4-42A6A94811BA@oracle.com>

> It seems we're all in favor of break-with over unadorned "break"?

Just for the record, there are possibly-complementary options in this direction. This is not a proposal, as much as putting them on the record.

 - Allow `break-with` as a synonym for `return` in lambdas. Using `return` was an uncomfortable choice (but this may well be locking the barn after the horse escapes.)
 - `break-to` as a synonym for labeled break.
 - `break-from for | while | do` as a synonym for "break from the innermost control construct of this kind" (as an alternative to creating a label.)

Overall I don't find any of these terribly compelling, but perhaps this may jog some actually-good ideas.

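For orientation only: when switch expressions were finalized (Java 14), the value-yielding statement was spelled `yield` rather than `break-with`, and labeled `break` was left unchanged. The sketch below shows both in released Java. `BreakWithSketch`, `daysIn`, and `firstRowContaining` are hypothetical names, and none of the `break-to` or `break-from` spellings floated above exist in the language.

```java
// Hypothetical helper class illustrating two existing constructs.
final class BreakWithSketch {

    // A switch expression: a block-bodied case hands back its value with
    // 'yield', the role discussed in this thread under the name 'break-with'.
    static int daysIn(String month) {
        return switch (month) {
            case "FEB" -> 28;
            case "APR", "JUN", "SEP", "NOV" -> 30;
            default -> {
                int days = 31;
                yield days;
            }
        };
    }

    // A labeled break: leaves both loops at once. This is the construct
    // that 'break-to' and 'break-from' would have offered new spellings for.
    static int firstRowContaining(int[][] grid, int target) {
        int found = -1;
        outer:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] == target) {
                    found = row;
                    break outer;
                }
            }
        }
        return found;
    }
}
```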
From kevinb at google.com Fri Feb 1 19:52:16 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Fri, 1 Feb 2019 14:52:16 -0500
Subject: break-with
In-Reply-To: <4AE6810F-8392-4B3E-AEC4-42A6A94811BA@oracle.com>
References: <3EB44AF2-D921-4E66-867A-162F49E83E4A@oracle.com> <4AE6810F-8392-4B3E-AEC4-42A6A94811BA@oracle.com>
Message-ID:

On Fri, Feb 1, 2019 at 2:41 PM Brian Goetz wrote:

> > It seems we're all in favor of break-with over unadorned "break"?
>
> Just for the record, there are possibly-complementary options in this
> direction. This is not a proposal, as much as putting them on the record.
>
> - Allow `break-with` as a synonym for `return` in lambdas. Using
> `return` was an uncomfortable choice (but this may well be locking the barn
> after the horse escapes.)

I don't think I agree that there is any problem with `return`. Lambdas ended up extremely similar to "concise anonymous classes (that downplay identity)". Users are generally well-served to think of them that way. It seems to me that any alternative to `return` could only confuse matters.

> - `break-to` as a synonym for labeled break.

Hmm. Presumably `break` would never be "deprecated", and permanent synonyms like this are undesirable. But this does read nicely and I don't hate it.

> - `break-from for | while | do` as a synonym for "break from the
> innermost control construct of this kind" (as an alternative to creating a
> label.)

I don't hate this either. If we want the feature then this would be a good name for it. And you may remember our data does actually show that it's not too rare for the loop types to be different in this way. But this feature would create a bit of perverse incentive to take two nested loop headers that might have very parallel structure and arbitrarily write them differently (for vs. while) just to avoid the ugly label.

> Overall I don't find any of these terribly compelling, but perhaps this
> may jog some actually-good ideas.

Same.

-- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From james.laskey at oracle.com Sun Feb 10 15:43:40 2019
From: james.laskey at oracle.com (Jim Laskey)
Date: Sun, 10 Feb 2019 11:43:40 -0400
Subject: String reboot
Message-ID:

> Focus
>
> Instead of ordering everything on the menu and immobilizing ourselves with excessive gluttony, let's focus our attention on the appetizer. If we plan correctly, we'll have room for entrees and desserts later.
>
> The appetizer here is simplifying the injection of "foreign" language code into Java source. Think tapas. We may well be sated by the time we're done.
>
> Goal
>
> Repurposing the Java String as a "foreign" code literal seems to be the most natural and least intrusive contrivance for Java support. In fact, this is already the case. Example;
>
> //
> //
> // Hello World.
> //
> //
> //
>
> String html = "\n" +
>               " \n" +
>               "Hello World.\n" +
>               " \n" +
>               " \n" +
>               "\n";
>
> The primary reason we are having the string literal discussion is that the existing form has a few issues;
>
> - The existing form is difficult to maintain without support from IDEs and is prone to error. The introduction and subsequent editing of foreign code requires additional delimiters, newlines, concatenations and escape sequences (DNCE).
>
> - More to the point, the existing form is difficult to read. The additional DNCE obscure the underlying content of the string.
>
> Our aim is to come up with a DNCE lexicon that improves foreign code literal readability and maintainability without leaving developers in a confused state; with emphasis on reducing the E (escape sequences.)
>
> 50% solution
>
> Where we keep running into trouble is that a choice for one part of the lexicon spreads into the other parts. That is, use of certain characters in the delimiter affects which characters require escaping and which characters can be used for escaping.
>
> So, let's pick off the easy bits of the lexicon first. Newlines, concatenations and in-between delimiters can be implicit if we just allow strings to span multiple lines (see Rust.)
>
> String html = "
>
> Hello World.
>
>
> ";
>
> That's not so bad. If we did nothing else, we still would be better off than we were before.
>
> 75% solution, almost
>
> What problems are left?
>
> - The foreign delimiters (quotes) have to be escaped.
>
> - The foreign escape sequences also have to be escaped.
>
> - And to a lesser degree, it's difficult to locate the closing delimiter.
>
> Fortunately, we don't have many choices for dealing with escapes;
>
> - Backslash is Java's escape character.
>
> - Either escaping is on or is off (raw), so we need a way to flag a string as being escaped. We could have an option to turn escaping on/off within a string, but it has been hard to come up with examples where this might be required.
>
> - Even with escaping off, we still might have to escape delimiters. Repeated backslashes (or repeated delimiters) is the typical out.
>
> How about trying \ as the flag for escapes off;
>
> String html = \"
>
> Hello World.
>
>
> ";
>
> That doesn't work because it looks like the string ends at the first quote. Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable because then it doesn't look like an escape sequence (see Swift.)
>
> String html = \"
>
> Hello World.
>
>
> "\;
>
> The only new string rule added is to allow multi-line strings.
>
> Adding backslash before and after the string indicates escaping off.
>
> But wait
>
> This looks like the 75% solution;
>
> - Builds on our cred with existing strings.
>
> - Escape processing is orthogonal to multi-line.
>
> - Delimiter can easily be understood to mean "string with escapes."
>
> But wait. "" looks like it contains the end delimiter. Rats!!! Captain, we need more sequences.
>
> And, this is the crux of all the debate around strings. Fixed delimiters imply a requirement for escape sequences, otherwise there is content you cannot express as a string.
>
> The inverse of this implication is that if you have escape sequences you don't need flexible delimiters. This can be reinterpreted as you only need flexible delimiters if you want to always avoid escape sequences.
>
> Wasn't avoiding escape sequences the goal?
>
> All this brings us to the central choice we have to make before we get into the rest of the meal. Do we go with fixed delimiter(s), structured delimiters or nonce delimiters?
>
> Fixed delimiter
>
> If we go with a fixed delimiter then we limit the content that can be expressed without escape sequences. This is not totally out of left field. There are floating point values we can not express in Java and types we can express but not denote, such as anonymous class types, intersection types or capture types.
>
> Everything is a degree of tradeoff. And, those tradeoffs are okay as long as we are explicit about it.
>
> We could get closer to the 85% mark if we had a way to have " in our content without escaping. Let's introduce a secondary delimiter, """.
>
> String html = """
>
> Hello World.
>
>
> """;
>
> The introduction of """ would allow " with the only restriction that we can not use """ in the content without escaping. We could say that """ also means escaping off, but then we would have no way to escape """ (\"""). Keeping escaping as an orthogonal issue allows the best of both worlds.
>
> String html = \"""
>
> Hello World.
>
>
> """\;
>
> Once you take away conflicts with the delimiter, most strings do not require escaping.
>
> Also at this point we should note that other combinations of quotes (''', ```, "'") don't bring anything new to the table; Tomato/Tomato, Potato/Potato.
>
> Summary: All strings can be expressed with fixed plus escaping, but can not express strings containing the fixed delimiter (""") with escaping off.
>
> Jumping ahead: I think that stating that traditional " strings must be single-line will be a popular restriction, even if it is not needed. Developers will then think of """ as meaning multi-line.
>
> Structured delimiter
>
> A structured delimiter contains a repeating pattern that can be expanded to suit a scenario. We attempted to introduce this notion with the original backtick proposal, but that proposal was withdrawn because a) we didn't want to burn the backtick, b) developers weren't comfortable with infinitely repeating delimiters, and c) non-expressible anomalies such as content with leading or trailing backticks.
>
> Using " instead of backtick addresses a).
>
> String html = """"""
>
> Hello World.
>
>
> """""";
>
> For b) is there a limit where developers would be comfortable? That is, what about a range of fixed delimiters; ", """, """", """"", """""". This is slightly different than fixed delimiters in that it increases the combinations of content containing delimiters. Example, """"" could allow ", """, """", ..., Nx" for N != 5.
>
> Structured delimiters also differ from fixed delimiters in the fact that there is pressure to have escaping off when N >= 3. You can always fall back to a single ".
>
> Summary: Can express all strings with and without escaping. If the delimiter length is limited then there is still a (smaller) set of strings that can not be expressed.
>
> Nonce delimiter
>
> A nonce or custom delimiter allows developers to include a unique character sequence in the delimiter. This provides a flexible delimiter without fear of going too far. There is also the advantage/distraction of providing commentary.
>
> String html = \HTML"
>
> Hello World.
>
>
> "HTML\;
>
> Summary: Can express all strings with and without escaping, but nonce can affect readability.
>
> Multi-line formatting
>
> I left this out of the main discussion, but I think we can all agree that formatting rules should separate the delimiters from the content. Other details can be refined after choice of delimiter(s).
>
> String html = \"""
>
>
> Hello World.
>
>
>
> """\;
>
> String html = """"""
>
>
> Hello World.
>
>
>
> """""";
>
> String html = \HTML"
>
>
> Hello World.
>
>
>
> "HTML\;
>
> Entrees and desserts
>
> If we make good choices now (stay away from the oysters) we can still move on to other courses later.
>
> For instance; if we got up from the table with the ", """, \", \""" set of delimiters, we could still introduce structured delimiters in the future; either with repeated \ (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""".
>
> Point being, we can work with an 85% solution now that we can supplement later when we're not so hangry.

From james.laskey at oracle.com Sun Feb 10 16:30:26 2019
From: james.laskey at oracle.com (James Laskey)
Date: Sun, 10 Feb 2019 12:30:26 -0400
Subject: String reboot
In-Reply-To:
References:
Message-ID:

I should know better than to format e-mails. Many a backslash eaten. The summary should be;

>> For instance; if we got up from the table with the ", """, \", \""" set of delimiters, we could still introduce structured delimiters in the future; either with repeated \ (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""".

Sent from my iPhone

From james.laskey at oracle.com Sun Feb 10 18:10:03 2019
From: james.laskey at oracle.com (Jim Laskey)
Date: Sun, 10 Feb 2019 14:10:03 -0400
Subject: String reboot (plain text)
In-Reply-To:
References:
Message-ID:

Focus
=====

Instead of ordering everything on the menu and immobilizing ourselves with excessive gluttony, let's focus our attention on the appetizer. If we plan correctly, we'll have room for entrees and desserts later.

The appetizer here is simplifying the injection of "foreign" language code into Java source. Think tapas. We may well be sated by the time we're done.

Goal
====

Repurposing the Java String as a "foreign" code literal seems to be the most natural and least intrusive contrivance for Java support. In fact, this is already the case. Example;

//
//
// Hello World.
//
//
//

String html = "\n" +
              " \n" +
              "

Hello World.

\n" + " \n" + " \n" + "\n"; The primary reason we are having the string literal discussion is that the existing form has a few issues; ? The existing form is difficult to maintain without support from IDEs and is prone to error. The introduction and subsequent editing of foreign code requires additional delimiters, newlines, concatenations and escape sequences (DNCE). ? More to the point, the existing form is difficult to read. The additional DNCE obscure the underlying content of the string. Our aim is to come up with a DNCE lexicon that improves foreign code literal readability and maintainability without leaving developers in a confused state; with emphasis on reducing the E (escape sequences.) 50% solution ============ Where we keep running into trouble is that a choice for one part of the lexicon spreads into the the other parts. That is, use of certain characters in the delimiter affect which characters require escaping and which characters can be used for escaping. So, let's pick off the lexicon easy bits first. Newlines, concatenations and in-between delimiters can be implicit if we just allow strings to span multiple lines (see Rust.) String html = "

Hello World.

"; That's not so bad. If we did nothing else, we still would be better off than we were before. 75% solution, almost ==================== What problems are left? ? The foreign delimiters (quotes) have to be escaped. ? The foreign escape sequences also have to be escaped. ? And to a lesser degree, it's difficult to locate the closing delimiter. Fortunately, we don't have many choices for dealing with escapes; ? Backslash is Java's escape character. ? Either escaping is on or is off (raw), so we need a way to flag a string as being escaped. We could have an option to turn escaping on/off within a string, but it has been hard to come up with examples where this might be required. ? Even with escaping off, we still might have to escape delimiters. Repeated backslashes (or repeated delimiters) is the typical out. How about trying \ as the flag for escapes off; String html = \"

Hello World.

"; That doesn't work because it looks like the string ends at the first quote. Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable because then it doesn't look like an escape sequence (see Swift.) String html = \"

Hello World.

"\; ? The only new string rule added is to allow multi-line strings. ? Adding backslash before and after the string indicates escaping off. But wait ======== This looks like the 75% solution; ? Builds on our cred with existing strings. ? Escape processing is orthogonal to multi-line. ? Delimiter can easily be understood to mean ?string with escapes." But wait. "\nloaded" looks like it contains the end delimiter. Rats!!! Captain we need more sequences. And, this is the crux of all the debate around strings. Fixed delimiters imply a requirement for escape sequences, otherwise there is content you cannot express as a string. The inverse of this implication is that if you have escape sequences you don't need flexible delimiters. This can be reinterpreted as you only need flexible delimiters if you want to always avoid escape sequences. Wasn't avoiding escape sequences the goal? All this brings us to the central choice we have to make before we get into the rest of the meal. Do we go with fixed delimiter(s), structured delimiters or nonce delimiters. Fixed delimiter =============== If we go with a fixed delimiter then we limit the content that can be expressed without escape sequences. This is not totally left field. There are floating point values we can not express in Java and types we can express but not denote, such as anonymous class types, intersection types or capture types. Everything is a degree of tradeoff. And, those tradeoffs are okay as long as we are explicit about it. We could get closer to the 85% mark if we had a way to have " in our content without escaping. Let's introduce a secondary delimiter, """. String html = """

Hello World.

"""; The introduction of """ would allow " with the only restriction that we can not use """ in the content without escaping. We could say that """ also means escaping off, but then we would have no way to escape """ (\"""). Keeping escaping as an orthogonal issue allows the best of both worlds. String html = \"""

Hello World.

"""\; Once you take away conflicts with the delimiter, most strings do not require escaping. Also at this point we should note that other combinations of quotes ('''. ```, "'") don't bring anything new to the table; Tomato/Tomato, Potato/Potato. Summary: All strings can be expressed with fixed plus escaping, but can not express strings containing the fixed delimiter (""") with escaping off. Jumping ahead: I think that stating that traditional " strings must be single-line will be a popular restriction, even if it not needed. Then they will think of """ as meaning multi-line. Structured delimiter ==================== A structured delimiter contains a repeating pattern that can be expanded to suit a scenario. We attempted to introduce this notion with the original backtick proposal, but that proposal was withdrawn because a) didn't want to burn the backtick, b) developers weren't comfortable with infinitely repeating delimiters, and c) non-expressible anomalies such as content with leading or trailing backticks. Using " instead of backtick addresses a). String html = """"""

Hello World.

""""""; For b) is there a limit where developers would be comfortable? That is, what about a range of fixed delimiters; ", """, """", """"", """""". This is slightly different than fixed delimiters in that it increases the combinations of content containing delimiters. Example, """"" could allow ", """, """", ..., Nx" for N != 5. Structured delimiters also differ from fixed delimiters in the fact that there is pressure to have escaping off when N >= 3. You can always fall back to a single ". Summary: Can express all strings with and without escaping. If the delimiter length is limited the there there is still a (smaller) set of strings that can not be expressed. Nonce delimiter =============== A nonce or custom delimiter allows developers to include a unique character sequence in the delimiter. This provides a flexible delimiter without fear of going too far. There is also the advantage/distraction of providing commentary. String html = \HTML"

Hello World.

"HTML\; Summary: Can express all strings with and without escaping, but nonce can affect readability. Multi-line formatting ===================== I left this out of the main discussion, but I think we can all agree that formatting rules should separate the delimiters from the content. Other details can be refined after choice of delimiter(s). String html = \"""

Hello World.

"""\; String html = """"""

Hello World.

""""""; String html = \HTML"

Hello World.

"HTML/; Entrees and desserts ==================== If we make good choices now (stay away from the oysters) we can still move on to other courses later. For instance; if we got up from the table with the ", """, \", \""" set of delimiters, we could still introduce structured delimiters in the future; either with repeated \ (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""". Point being, we can work with a 85% solution now that we can supplement later when we're not so hangry. > On Feb 10, 2019, at 12:30 PM, James Laskey wrote: > > I should know better than format e-mails. Many a backslash eaten. The summary should be; > >>> For instance; if we got up from the table with the ", """, \", \""" set of delimiters, we could still introduce structured delimiters in the future; either with repeated \ (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""". >>> > > > > Sent from my iPhone > > On Feb 10, 2019, at 11:43 AM, Jim Laskey wrote: > >> >>> >>> >>> Focus >>> >>> Instead of ordering everything on the menu and immobilizing ourselves with excessive gluttony, let?s focus our attention on the appetizer. If we plan correctly, we'll have room for entrees and desserts later. >>> >>> The appetizer here is simplifying the injection of "foreign" language code into Java source. Think tapas. We may well be sated by the time we?re done. >>> >>> Goal >>> >>> Repurposing the Java String as a "foreign" code literal seems to be the most natural and least intrusive contrivance for Java support. In fact, this is already the case. Example; >>> >>> // >>> // >>> //

Hello World.

>>> // >>> // >>> // >>> >>> String html = "\n" + >>> " \n" + >>> "

Hello World.

\n" + >>> " \n" + >>> " \n" + >>> "\n"; >>> >>> The primary reason we are having the string literal discussion is that the existing form has a few issues; >>> >>> ? The existing form is difficult to maintain without support from IDEs and is prone to error. The introduction and subsequent editing of foreign code requires additional delimiters, newlines, concatenations and escape sequences (DNCE). >>> >>> ? More to the point, the existing form is difficult to read. The additional DNCE obscure the underlying content of the string. >>> >>> Our aim is to come up with a DNCE lexicon that improves foreign code literal readability and maintainability without leaving developers in a confused state; with emphasis on reducing the E (escape sequences.) >>> >>> 50% solution >>> >>> Where we keep running into trouble is that a choice for one part of the lexicon spreads into the the other parts. That is, use of certain characters in the delimiter affect which characters require escaping and which characters can be used for escaping. >>> >>> So, let's pick off the lexicon easy bits first. Newlines, concatenations and in-between delimiters can be implicit if we just allow strings to span multiple lines (see Rust.) >>> >>> String html = " >>> >>>

Hello World.

>>> >>> >>> "; >>> >>> That's not so bad. If we did nothing else, we still would be better off than we were before. >>> >>> 75% solution, almost >>> >>> What problems are left? >>> >>> ? The foreign delimiters (quotes) have to be escaped. >>> >>> ? The foreign escape sequences also have to be escaped. >>> >>> ? And to a lesser degree, it's difficult to locate the closing delimiter. >>> >>> Fortunately, we don't have many choices for dealing with escapes; >>> >>> ? Backslash is Java's escape character. >>> >>> ? Either escaping is on or is off (raw), so we need a way to flag a string as being escaped. We could have an option to turn escaping on/off within a string, but it has been hard to come up with examples where this might be required. >>> >>> ? Even with escaping off, we still might have to escape delimiters. Repeated backslashes (or repeated delimiters) is the typical out. >>> >>> How about trying as the flag for escapes off; >>> >>> String html = \" >>> >>>

Hello World.

>>> >>> >>> "; >>> >>> That doesn't work because it looks like the string ends at the first quote. Let's try symmetry, either " or " as the closing delimiter. " is preferable because then it doesn't look like an escape sequence (see Swift.) >>> >>> String html = \" >>> >>>

Hello World.

>>> >>> >>> "\; >>> >>> ? The only new string rule added is to allow multi-line strings. >>> >>> ? Adding backslash before and after the string indicates escaping off. >>> >>> But wait >>> >>> This looks like the 75% solution; >>> >>> ? Builds on our cred with existing strings. >>> >>> ? Escape processing is orthogonal to multi-line. >>> >>> ? Delimiter can easily be understood to mean ?string with escapes." >>> >>> But wait. "" looks like it contains the end delimiter. Rats!!! Captain we need more sequences. >>> >>> And, this is the crux of all the debate around strings. Fixed delimiters imply a requirement for escape sequences, otherwise there is content you cannot express as a string. >>> >>> The inverse of this implication is that if you have escape sequences you don't need flexible delimiters. This can be reinterpreted as you only need flexible delimiters if you want to always avoid escape sequences. >>> >>> Wasn't avoiding escape sequences the goal? >>> >>> All this brings us to the central choice we have to make before we get into the rest of the meal. Do we go with fixed delimiter(s), structured delimiters or nonce delimiters. >>> >>> Fixed delimiter >>> >>> If we go with a fixed delimiter then we limit the content that can be expressed without escape sequences. This is not totally left field. There are floating point values we can not express in Java and types we can express but not denote, such as anonymous class types, intersection types or capture types. >>> >>> Everything is a degree of tradeoff. And, those tradeoffs are okay as long as we are explicit about it. >>> >>> We could get closer to the 85% mark if we had a way to have " in our content without escaping. Let's introduce a secondary delimiter, """. >>> >>> String html = """ >>> >>>

Hello World.

>>> >>> >>> """; >>> >>> The introduction of """ would allow " with the only restriction that we can not use """ in the content without escaping. We could say that """ also means escaping off, but then we would have no way to escape """ (\"""). Keeping escaping as an orthogonal issue allows the best of both worlds. >>> >>> String html = \""" >>> >>>

Hello World.

>>> >>> >>> """\; >>> >>> Once you take away conflicts with the delimiter, most strings do not require escaping. >>> >>> Also at this point we should note that other combinations of quotes ('''. ```, "'") don't bring anything new to the table; Tomato/Tomato, Potato/Potato. >>> >>> Summary: All strings can be expressed with fixed plus escaping, but can not express strings containing the fixed delimiter (""") with escaping off. >>> >>> Jumping ahead: I think that stating that traditional " strings must be single-line will be a popular restriction, even if it not needed. Then they will think of """ as meaning multi-line. >>> >>> Structured delimiter >>> >>> A structured delimiter contains a repeating pattern that can be expanded to suit a scenario. We attempted to introduce this notion with the original backtick proposal, but that proposal was withdrawn because a) didn't want to burn the backtick, b) developers weren't comfortable with infinitely repeating delimiters, and c) non-expressible anomalies such as content with leading or trailing backticks. >>> >>> Using " instead of backtick addresses a). >>> >>> String html = """""" >>> >>>

Hello World.

>>> >>> >>> """"""; >>> >>> For b) is there a limit where developers would be comfortable? That is, what about a range of fixed delimiters; ", """, """", """"", """""". This is slightly different than fixed delimiters in that it increases the combinations of content containing delimiters. Example, """"" could allow ", """, """", ..., Nx" for N != 5. >>> >>> Structured delimiters also differ from fixed delimiters in the fact that there is pressure to have escaping off when N >= 3. You can always fall back to a single ". >>> >>> Summary: Can express all strings with and without escaping. If the delimiter length is limited the there there is still a (smaller) set of strings that can not be expressed. >>> >>> Nonce delimiter >>> >>> A nonce or custom delimiter allows developers to include a unique character sequence in the delimiter. This provides a flexible delimiter without fear of going too far. There is also the advantage/distraction of providing commentary. >>> >>> String html = \HTML" >>> >>>

Hello World.

>>> >>> >>> "HTML\; >>> >>> Summary: Can express all strings with and without escaping, but nonce can affect readability. >>> >>> Multi-line formatting >>> >>> I left this out of the main discussion, but I think we can all agree that formatting rules should separate the delimiters from the content. Other details can be refined after choice of delimiter(s). >>> >>> String html = \""" >>> >>> >>>

Hello World.

>>> >>> >>> >>> """\; >>> >>> String html = """""" >>> >>> >>>

Hello World.

>>> >>> >>> >>> """"""; >>> >>> String html = \HTML" >>> >>> >>>

Hello World.

>>> >>> >>> >>> "HTML/; >>> >>> Entrees and desserts >>> >>> If we make good choices now (stay away from the oysters) we can still move on to other courses later. >>> >>> For instance; if we got up from the table with the ", """, ", """ set of delimiters, we could still introduce structured delimiters in the future; either with repeated (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like " for \\" or """"". >>> >>> Point being, we can work with a 85% solution now that we can supplement later when we're not so hangry. >>> From john.r.rose at oracle.com Sun Feb 10 21:05:16 2019 From: john.r.rose at oracle.com (John Rose) Date: Sun, 10 Feb 2019 13:05:16 -0800 Subject: String reboot In-Reply-To: References: Message-ID: <8BF2DB1F-E3B2-4705-B908-32F754E4809F@oracle.com> On Feb 10, 2019, at 8:30 AM, James Laskey wrote: > > I should know better than format e-mails. Many a backslash eaten. The summary should be; > >>> For instance; if we got up from the table with the ", """, \", \""" set of delimiters, we could still introduce structured delimiters in the future; either with repeated \ (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""". Heh, I wondered about that. I'd like to propose a term, *for the dessert menu*, to better capture this idea of a "pseudo-nonce". I call it a "strong quote". A "strong quote" is a quote drawn from an infinite non-periodic set of candidates, all of which are reserved as options for introducing a quoted string. (Minor point #1: Really I should have said "scheme for strong quoting and unquoting". But "strong quote" sounds better and is unambiguous in context.) (Minor point #2: Strong quotes need strong unquotes. The set of unquotes needs to be correspondingly infinite, and each strong unquote needs to be derived from a corresponding quote. Unquotes could be the same strings as quotes, reversed strings, etc. 
They should be distinct from something that might occur naturally in a string body, like an escape sequence. The infinite choice of quotes is the major point.)

Two important things to notice about strong quotes:

1. The infinite set is what makes the quotes strong. Any finite set can be embarrassed by asking it to quote a string which contains *all* the corresponding unquotes in the finite repertoire. Any finite choice takes us into those arguments that say, "surely we don't care about users who care to quote *that*".

2. The infinite set doesn't need to allow users to exercise some tasteless creativity, by picking nonces which (say) encode misleading intentions or irritating sentiments. The infinite set can be quite simple and regular.

Two more, less important, things to notice:

3. The infinite set should not be just the cyclic repetition of some seed string, like a single quote character. This is why the word "non-periodic" appears above. In the periodic case (alone) there are some strings which just barely miss being quotable, because you can't tell when the strong open quote ends and the string body begins. So the scheme of repeating a quote character "enough times" just barely misses being a proper strong quote scheme.

4. User creativity can be excluded even more by positing a normal form for a strongly-quoted string. In other words, although there is an infinite choice of strong quotes, the choice for a specific quote can be made mandatory, based on the context of the string to be quoted. An obvious choice is the shortest (or first) strong quote from the infinite set whose corresponding strong unquote does *not* occur in the string to be quoted. Whether a mis-quoted string is reported by an error or a warning is a further choice in user experience.
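To make the normal form in point 4 concrete, here is a minimal Java sketch. It assumes the '\' Numeral '"' open quote mentioned earlier in the thread and, purely as an assumption for this illustration, a mirrored unquote of the form '"' Numeral '\'; the actual unquote syntax is not settled anywhere in this discussion. The names StrongQuote, open, close, normalLevel and quote are hypothetical.

```java
// Sketch only: picks the first strong quote whose unquote is absent from the body.
final class StrongQuote {

    // Assumed open quote: backslash, decimal numeral, double quote (e.g. \3").
    static String open(int n) {
        return "\\" + n + "\"";
    }

    // Assumed unquote (not specified in the thread): the mirror image, e.g. "3\
    static String close(int n) {
        return "\"" + n + "\\";
    }

    // Normal form: the smallest n >= 1 whose unquote does not occur in the body.
    static int normalLevel(String body) {
        int n = 1;
        while (body.contains(close(n))) {
            n++;
        }
        return n;
    }

    // Wraps the body in the normal-form strong quote and unquote.
    static String quote(String body) {
        int n = normalLevel(body);
        return open(n) + body + close(n);
    }

    public static void main(String[] args) {
        // No unquote occurs in the body, so level 1 is chosen.
        System.out.println(quote("Hello World."));       // prints \1"Hello World."1\
        // The body contains the level-1 unquote, so level 2 is chosen.
        System.out.println(normalLevel("say \"1\\ twice")); // prints 2
    }
}
```

The loop always terminates, because the body is finite and the candidate unquotes keep growing in length. A compiler could run the same search to report a mis-quoted string, whether as an error or as a warning.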
The example scheme Jim noted of '\' Numeral '"' is an example of a strong quoting scheme, with the corresponding unquotes left as an exercise, including the task of avoiding collision with existing string body syntaxes.

HTH
-- John

From john.r.rose at oracle.com Sun Feb 10 21:13:56 2019
From: john.r.rose at oracle.com (John Rose)
Date: Sun, 10 Feb 2019 13:13:56 -0800
Subject: String reboot
In-Reply-To: <8BF2DB1F-E3B2-4705-B908-32F754E4809F@oracle.com>
References: <8BF2DB1F-E3B2-4705-B908-32F754E4809F@oracle.com>
Message-ID: <47E7995A-7856-40FA-95E6-795A2969EE75@oracle.com>

On Feb 10, 2019, at 1:05 PM, John Rose wrote:
>
> 3. The infinite set should not be just the cyclic repetition
> of some seed string, like a single quote character. This
> is why the word "non-periodic" appears above. In the
> periodic case (alone) there are some strings which just barely
> miss being quotable, because you can't tell when the strong
> open quote ends and the string body begins. So the scheme
> of repeating a quote character "enough times" just barely
> misses being a proper strong quote scheme.

P.S. For those eagle-eyed and mathematically inclined: Periodicity is not exactly the right condition to avoid. The condition to avoid is more like every open-quote is an initial substring of another open-quote, in the infinite set of open-quotes. For example, initial finite sequences of the digits of pi would fail to be a set of strong quotes. Such cases are easy to avoid once periodicity has been eliminated.

From forax at univ-mlv.fr Sun Feb 10 23:09:03 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 11 Feb 2019 00:09:03 +0100 (CET)
Subject: String reboot (plain text)
In-Reply-To:
References:
Message-ID: <1132278750.7646.1549840143679.JavaMail.zimbra@u-pem.fr>

About the formatting rules, we can reuse the doc comment trick to use a character to specify the alignment.

String html = \"
"
"
"

Hello World.

" " " "\; i think it makes the code more readable if the nonce is several characters String html = \""" " " "

Hello World.

" " " """\; R?mi ----- Mail original ----- > De: "Jim Laskey" > ?: "amber-spec-experts" > Envoy?: Dimanche 10 F?vrier 2019 19:10:03 > Objet: Re: String reboot (plain text) > Focus > ===== > > Instead of ordering everything on the menu and immobilizing ourselves with > excessive gluttony, let?s focus our attention on the appetizer. If we plan > correctly, we'll have room for entrees and desserts later. > > The appetizer here is simplifying the injection of "foreign" language code into > Java source. Think tapas. We may well be sated by the time we?re done. > > > Goal > ==== > > Repurposing the Java String as a "foreign" code literal seems to be the most > natural and least intrusive contrivance for Java support. In fact, this is > already the case. Example; > > // > // > //

Hello World.

> // > // > // > > String html = "\n" + > " \n" + > "

Hello World.

\n" + > " \n" + > " \n" + > "\n"; > > The primary reason we are having the string literal discussion is that the > existing form has a few issues; > > ? The existing form is difficult to maintain without support from IDEs and is > prone to error. The introduction and subsequent editing of foreign code > requires additional delimiters, newlines, concatenations and escape sequences > (DNCE). > > ? More to the point, the existing form is difficult to read. The additional DNCE > obscure the underlying content of the string. > > Our aim is to come up with a DNCE lexicon that improves foreign code literal > readability and maintainability without leaving developers in a confused state; > with emphasis on reducing the E (escape sequences.) > > > 50% solution > ============ > > Where we keep running into trouble is that a choice for one part of the lexicon > spreads into the the other parts. That is, use of certain characters in the > delimiter affect which characters require escaping and which characters can be > used for escaping. > > So, let's pick off the lexicon easy bits first. Newlines, concatenations and > in-between delimiters can be implicit if we just allow strings to span multiple > lines (see Rust.) > > String html = " > >

Hello World.

> > > "; > > That's not so bad. If we did nothing else, we still would be better off than we > were before. > > > 75% solution, almost > ==================== > > What problems are left? > > ? The foreign delimiters (quotes) have to be escaped. > > ? The foreign escape sequences also have to be escaped. > > ? And to a lesser degree, it's difficult to locate the closing delimiter. > > Fortunately, we don't have many choices for dealing with escapes; > > ? Backslash is Java's escape character. > > ? Either escaping is on or is off (raw), so we need a way to flag a string as > being escaped. We could have an option to turn escaping on/off within a string, > but it has been hard to come up with examples where this might be required. > > ? Even with escaping off, we still might have to escape delimiters. Repeated > backslashes (or repeated delimiters) is the typical out. > > How about trying \ as the flag for escapes off; > > String html = \" > >

Hello World.

> > > "; > > That doesn't work because it looks like the string ends at the first quote. > Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable > because then it doesn't look like an escape sequence (see Swift.) > > String html = \" > >

Hello World.

> > > "\; > > ? The only new string rule added is to allow multi-line strings. > > ? Adding backslash before and after the string indicates escaping off. > > > But wait > ======== > > This looks like the 75% solution; > > ? Builds on our cred with existing strings. > > ? Escape processing is orthogonal to multi-line. > > ? Delimiter can easily be understood to mean ?string with escapes." > > But wait. "\nloaded" looks like it contains the end delimiter. Rats!!! Captain > we need more sequences. > > And, this is the crux of all the debate around strings. Fixed delimiters imply a > requirement for escape sequences, otherwise there is content you cannot express > as a string. > > The inverse of this implication is that if you have escape sequences you don't > need flexible delimiters. This can be reinterpreted as you only need flexible > delimiters if you want to always avoid escape sequences. > > Wasn't avoiding escape sequences the goal? > > All this brings us to the central choice we have to make before we get into the > rest of the meal. Do we go with fixed delimiter(s), structured delimiters or > nonce delimiters. > > > Fixed delimiter > =============== > > If we go with a fixed delimiter then we limit the content that can be expressed > without escape sequences. This is not totally left field. There are floating > point values we can not express in Java and types we can express but not > denote, such as anonymous class types, intersection types or capture types. > > Everything is a degree of tradeoff. And, those tradeoffs are okay as long as we > are explicit about it. > > We could get closer to the 85% mark if we had a way to have " in our content > without escaping. Let's introduce a secondary delimiter, """. > > String html = """ > >

Hello World.

> > > """; > > The introduction of """ would allow " with the only restriction that we can not > use """ in the content without escaping. We could say that """ also means > escaping off, but then we would have no way to escape """ (\"""). Keeping > escaping as an orthogonal issue allows the best of both worlds. > > String html = \""" > >

Hello World.

> > > """\; > > Once you take away conflicts with the delimiter, most strings do not require > escaping. > > Also at this point we should note that other combinations of quotes ('''. ```, > "'") don't bring anything new to the table; Tomato/Tomato, Potato/Potato. > > Summary: All strings can be expressed with fixed plus escaping, but can not > express strings containing the fixed delimiter (""") with escaping off. > > Jumping ahead: I think that stating that traditional " strings must be > single-line will be a popular restriction, even if it not needed. Then they > will think of """ as meaning multi-line. > > > Structured delimiter > ==================== > > A structured delimiter contains a repeating pattern that can be expanded to suit > a scenario. We attempted to introduce this notion with the original backtick > proposal, but that proposal was withdrawn because a) didn't want to burn the > backtick, b) developers weren't comfortable with infinitely repeating > delimiters, and c) non-expressible anomalies such as content with leading or > trailing backticks. > > Using " instead of backtick addresses a). > > String html = """""" > >

Hello World.

> > > """"""; > > For b) is there a limit where developers would be comfortable? That is, what > about a range of fixed delimiters; ", """, """", """"", """""". This is > slightly different than fixed delimiters in that it increases the combinations > of content containing delimiters. Example, """"" could allow ", """, """", ..., > Nx" for N != 5. > > Structured delimiters also differ from fixed delimiters in the fact that there > is pressure to have escaping off when N >= 3. You can always fall back to a > single ". > > Summary: Can express all strings with and without escaping. If the delimiter > length is limited the there there is still a (smaller) set of strings that can > not be expressed. > > > Nonce delimiter > =============== > > A nonce or custom delimiter allows developers to include a unique character > sequence in the delimiter. This provides a flexible delimiter without fear of > going too far. There is also the advantage/distraction of providing commentary. > > String html = \HTML" > >

Hello World.

> > > "HTML\; > > Summary: Can express all strings with and without escaping, but nonce can affect > readability. > > > Multi-line formatting > ===================== > > I left this out of the main discussion, but I think we can all agree that > formatting rules should separate the delimiters from the content. Other details > can be refined after choice of delimiter(s). > > String html = \""" > > >

Hello World.

> > > > """\; > > String html = """""" > > >

Hello World.

> > > > """"""; > > String html = \HTML" > > >

Hello World.

> > > > "HTML/; > > > Entrees and desserts > ==================== > > If we make good choices now (stay away from the oysters) we can still move on to > other courses later. > > For instance; if we got up from the table with the ", """, \", \""" set of > delimiters, we could still introduce structured delimiters in the future; > either with repeated \ (see Swift) or repeated ". We could also follow a > suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""". > > Point being, we can work with a 85% solution now that we can supplement later > when we're not so hangry. > > > > > > >> On Feb 10, 2019, at 12:30 PM, James Laskey wrote: >> >> I should know better than format e-mails. Many a backslash eaten. The summary >> should be; >> >>>> For instance; if we got up from the table with the ", """, \", \""" set of >>>> delimiters, we could still introduce structured delimiters in the future; >>>> either with repeated \ (see Swift) or repeated ". We could also follow a >>>> suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""". >>>> >> >> >> >> Sent from my iPhone >> >> On Feb 10, 2019, at 11:43 AM, Jim Laskey wrote: >> >>> >>>> >>>> >>>> Focus >>>> >>>> Instead of ordering everything on the menu and immobilizing ourselves with >>>> excessive gluttony, let?s focus our attention on the appetizer. If we plan >>>> correctly, we'll have room for entrees and desserts later. >>>> >>>> The appetizer here is simplifying the injection of "foreign" language code into >>>> Java source. Think tapas. We may well be sated by the time we?re done. >>>> >>>> Goal >>>> >>>> Repurposing the Java String as a "foreign" code literal seems to be the most >>>> natural and least intrusive contrivance for Java support. In fact, this is >>>> already the case. Example; >>>> >>>> // >>>> // >>>> //

Hello World.

>>>> // >>>> // >>>> // >>>> >>>> String html = "\n" + >>>> " \n" + >>>> "

Hello World.

\n" + >>>> " \n" + >>>> " \n" + >>>> "\n"; >>>> >>>> The primary reason we are having the string literal discussion is that the >>>> existing form has a few issues; >>>> >>>> ? The existing form is difficult to maintain without support from IDEs and is >>>> prone to error. The introduction and subsequent editing of foreign code >>>> requires additional delimiters, newlines, concatenations and escape sequences >>>> (DNCE). >>>> >>>> ? More to the point, the existing form is difficult to read. The additional DNCE >>>> obscure the underlying content of the string. >>>> >>>> Our aim is to come up with a DNCE lexicon that improves foreign code literal >>>> readability and maintainability without leaving developers in a confused state; >>>> with emphasis on reducing the E (escape sequences.) >>>> >>>> 50% solution >>>> >>>> Where we keep running into trouble is that a choice for one part of the lexicon >>>> spreads into the the other parts. That is, use of certain characters in the >>>> delimiter affect which characters require escaping and which characters can be >>>> used for escaping. >>>> >>>> So, let's pick off the lexicon easy bits first. Newlines, concatenations and >>>> in-between delimiters can be implicit if we just allow strings to span multiple >>>> lines (see Rust.) >>>> >>>> String html = " >>>> >>>>

Hello World.

>>>> >>>> >>>> "; >>>> >>>> That's not so bad. If we did nothing else, we still would be better off than we >>>> were before. >>>> >>>> 75% solution, almost >>>> >>>> What problems are left? >>>> >>>> ? The foreign delimiters (quotes) have to be escaped. >>>> >>>> ? The foreign escape sequences also have to be escaped. >>>> >>>> ? And to a lesser degree, it's difficult to locate the closing delimiter. >>>> >>>> Fortunately, we don't have many choices for dealing with escapes; >>>> >>>> ? Backslash is Java's escape character. >>>> >>>> ? Either escaping is on or is off (raw), so we need a way to flag a string as >>>> being escaped. We could have an option to turn escaping on/off within a string, >>>> but it has been hard to come up with examples where this might be required. >>>> >>>> ? Even with escaping off, we still might have to escape delimiters. Repeated >>>> backslashes (or repeated delimiters) is the typical out. >>>> >>>> How about trying as the flag for escapes off; >>>> >>>> String html = \" >>>> >>>>

Hello World.

>>>> >>>> >>>> "; >>>> >>>> That doesn't work because it looks like the string ends at the first quote. >>>> Let's try symmetry, either " or " as the closing delimiter. " is preferable >>>> because then it doesn't look like an escape sequence (see Swift.) >>>> >>>> String html = \" >>>> >>>>

Hello World.

>>>> >>>> >>>> "\; >>>> >>>> ? The only new string rule added is to allow multi-line strings. >>>> >>>> ? Adding backslash before and after the string indicates escaping off. >>>> >>>> But wait >>>> >>>> This looks like the 75% solution; >>>> >>>> ? Builds on our cred with existing strings. >>>> >>>> ? Escape processing is orthogonal to multi-line. >>>> >>>> ? Delimiter can easily be understood to mean ?string with escapes." >>>> >>>> But wait. "" looks like it contains the end delimiter. Rats!!! Captain we need >>>> more sequences. >>>> >>>> And, this is the crux of all the debate around strings. Fixed delimiters imply a >>>> requirement for escape sequences, otherwise there is content you cannot express >>>> as a string. >>>> >>>> The inverse of this implication is that if you have escape sequences you don't >>>> need flexible delimiters. This can be reinterpreted as you only need flexible >>>> delimiters if you want to always avoid escape sequences. >>>> >>>> Wasn't avoiding escape sequences the goal? >>>> >>>> All this brings us to the central choice we have to make before we get into the >>>> rest of the meal. Do we go with fixed delimiter(s), structured delimiters or >>>> nonce delimiters. >>>> >>>> Fixed delimiter >>>> >>>> If we go with a fixed delimiter then we limit the content that can be expressed >>>> without escape sequences. This is not totally left field. There are floating >>>> point values we can not express in Java and types we can express but not >>>> denote, such as anonymous class types, intersection types or capture types. >>>> >>>> Everything is a degree of tradeoff. And, those tradeoffs are okay as long as we >>>> are explicit about it. >>>> >>>> We could get closer to the 85% mark if we had a way to have " in our content >>>> without escaping. Let's introduce a secondary delimiter, """. >>>> >>>> String html = """ >>>> >>>>

Hello World.

>>>> >>>> >>>> """; >>>> >>>> The introduction of """ would allow " with the only restriction that we can not >>>> use """ in the content without escaping. We could say that """ also means >>>> escaping off, but then we would have no way to escape """ (\"""). Keeping >>>> escaping as an orthogonal issue allows the best of both worlds. >>>> >>>> String html = \""" >>>> >>>>

Hello World.

>>>> >>>> >>>> """\; >>>> >>>> Once you take away conflicts with the delimiter, most strings do not require >>>> escaping. >>>> >>>> Also at this point we should note that other combinations of quotes ('''. ```, >>>> "'") don't bring anything new to the table; Tomato/Tomato, Potato/Potato. >>>> >>>> Summary: All strings can be expressed with fixed plus escaping, but can not >>>> express strings containing the fixed delimiter (""") with escaping off. >>>> >>>> Jumping ahead: I think that stating that traditional " strings must be >>>> single-line will be a popular restriction, even if it not needed. Then they >>>> will think of """ as meaning multi-line. >>>> >>>> Structured delimiter >>>> >>>> A structured delimiter contains a repeating pattern that can be expanded to suit >>>> a scenario. We attempted to introduce this notion with the original backtick >>>> proposal, but that proposal was withdrawn because a) didn't want to burn the >>>> backtick, b) developers weren't comfortable with infinitely repeating >>>> delimiters, and c) non-expressible anomalies such as content with leading or >>>> trailing backticks. >>>> >>>> Using " instead of backtick addresses a). >>>> >>>> String html = """""" >>>> >>>>

Hello World.

>>>> >>>> >>>> """"""; >>>> >>>> For b) is there a limit where developers would be comfortable? That is, what >>>> about a range of fixed delimiters; ", """, """", """"", """""". This is >>>> slightly different than fixed delimiters in that it increases the combinations >>>> of content containing delimiters. Example, """"" could allow ", """, """", ..., >>>> Nx" for N != 5. >>>> >>>> Structured delimiters also differ from fixed delimiters in the fact that there >>>> is pressure to have escaping off when N >= 3. You can always fall back to a >>>> single ". >>>> >>>> Summary: Can express all strings with and without escaping. If the delimiter >>>> length is limited the there there is still a (smaller) set of strings that can >>>> not be expressed. >>>> >>>> Nonce delimiter >>>> >>>> A nonce or custom delimiter allows developers to include a unique character >>>> sequence in the delimiter. This provides a flexible delimiter without fear of >>>> going too far. There is also the advantage/distraction of providing commentary. >>>> >>>> String html = \HTML" >>>> >>>>

Hello World.

>>>> >>>> >>>> "HTML\; >>>> >>>> Summary: Can express all strings with and without escaping, but nonce can affect >>>> readability. >>>> >>>> Multi-line formatting >>>> >>>> I left this out of the main discussion, but I think we can all agree that >>>> formatting rules should separate the delimiters from the content. Other details >>>> can be refined after choice of delimiter(s). >>>> >>>> String html = \""" >>>> >>>> >>>>

Hello World.

>>>> >>>> >>>> >>>> """\; >>>> >>>> String html = """""" >>>> >>>> >>>>

Hello World.

>>>> >>>> >>>> >>>> """"""; >>>> >>>> String html = \HTML" >>>> >>>> >>>>

Hello World.

>>>> >>>> >>>> >>>> "HTML/; >>>> >>>> Entrees and desserts >>>> >>>> If we make good choices now (stay away from the oysters) we can still move on to >>>> other courses later. >>>> >>>> For instance; if we got up from the table with the ", """, ", """ set of >>>> delimiters, we could still introduce structured delimiters in the future; >>>> either with repeated (see Swift) or repeated ". We could also follow a >>>> suggestion John made to use a pseudo nonce like " for \\" or """"". >>>> >>>> Point being, we can work with a 85% solution now that we can supplement later >>>> when we're not so hangry. From forax at univ-mlv.fr Wed Feb 13 00:15:39 2019 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 13 Feb 2019 01:15:39 +0100 (CET) Subject: JEP 303 vs JEP 348 Message-ID: <1616777606.490271.1550016939398.JavaMail.zimbra@u-pem.fr> JEP 348 provides a way to replace a method call with a ldc/indy, so it's another way to implement JEP 303 (the intrinsification part) given that an Intrinsics. (resp a ldc.invokedynamic) is just a method annotated with both PolymorphicSignature and CompilerIntrinsicCandidate. class Intrinsics { @CompilerIntrinsicCandidate @PolymorphicSignature public static Object invokedynamic(BootstrapSpecifier indy, Object... args) { return null; } } So in my opinion, we should withdraw JEP 303, adds a new JEP only about the constant propagation + mirroring of ConstantDesc and add Intrinsics.ldc and Intrinsic.invokedynamic as part of JEP 348. R?mi From brian.goetz at oracle.com Wed Feb 13 16:56:20 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 13 Feb 2019 11:56:20 -0500 Subject: JEP 303 vs JEP 348 In-Reply-To: <1616777606.490271.1550016939398.JavaMail.zimbra@u-pem.fr> References: <1616777606.490271.1550016939398.JavaMail.zimbra@u-pem.fr> Message-ID: <667178F3-AD6D-416E-ADEF-183BBBA815BC@oracle.com> I understand why you might think this, but there?s a method to our madness. 
Yes, both share the commonality of replacing a method call with something else. But, that?s where the similarity ends. For JEP 348, this is an opportunistic optimization; for 303, it is essential functionality (the call can?t proceed un-intrinsified.) And, for the indy() intrinsic, it simply cannot live without the constant folding and propagation. So separating those aspects is not practical. The two differ dramatically in their spec impact, as well. 348 requires little more than permission to redirect the translation for specific methods; 303 is much more deeply intrusive. > On Feb 12, 2019, at 7:15 PM, Remi Forax wrote: > > JEP 348 provides a way to replace a method call with a ldc/indy, > so it's another way to implement JEP 303 (the intrinsification part) given that an Intrinsics. (resp a ldc.invokedynamic) is just a method annotated with both PolymorphicSignature and CompilerIntrinsicCandidate. > > class Intrinsics { > @CompilerIntrinsicCandidate > @PolymorphicSignature > public static Object invokedynamic(BootstrapSpecifier indy, Object... args) { return null; } > } > > So in my opinion, we should withdraw JEP 303, adds a new JEP only about the constant propagation + mirroring of ConstantDesc and add Intrinsics.ldc and Intrinsic.invokedynamic as part of JEP 348. > > R?mi From forax at univ-mlv.fr Thu Feb 14 11:50:41 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Thu, 14 Feb 2019 12:50:41 +0100 (CET) Subject: JEP 303 vs JEP 348 In-Reply-To: <667178F3-AD6D-416E-ADEF-183BBBA815BC@oracle.com> References: <1616777606.490271.1550016939398.JavaMail.zimbra@u-pem.fr> <667178F3-AD6D-416E-ADEF-183BBBA815BC@oracle.com> Message-ID: <1832795720.54507.1550145041246.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "amber-spec-experts" > Envoy?: Mercredi 13 F?vrier 2019 17:56:20 > Objet: Re: JEP 303 vs JEP 348 > I understand why you might think this, but there?s a method to our madness. 
> > Yes, both share the commonality of replacing a method call with something else. > But, that?s where the similarity ends. no, > For JEP 348, this is an opportunistic > optimization; for 303, it is essential functionality (the call can?t proceed > un-intrinsified.) And, for the indy() intrinsic, it simply cannot live without > the constant folding and propagation. So separating those aspects is not > practical. yes, on paper, but for JEP 348 you also need to know if the arguments of an intrinsics are constant or not. For String.valueOf(), knowing if the format is constant or not allows further optimizations, for Objects.hash() if all arguments are constant then it's a ldc. It's not hypothetical, the implementation of JEP 348 already works like this. The only difference is that with JEP 348, you can fallback to an unoptimized version, while for JEP 303 you may think that having a non constant parameters for Intrinsics.ldc/invokedynamic make little sense but i beg to disagree (see below). > > The two differ dramatically in their spec impact, as well. 348 requires little > more than permission to redirect the translation for specific methods; 303 is > much more deeply intrusive. We can pull the VarHandle trick ! Instead of specifying the semantics of Intrinsics.ldc/invokedynamic in the JLS which is as you said intrusive, you can move the spec part in the javadoc by transforming the compile time error to a runtime one. If the arguments of Intrinsics.ldc/invokedynamic are not constant, the compiler can still generate an indy call that will verify at runtime that the arguments are constants (technically by verifying that methods are always called with the same instances doing pointer checks). So we have moved compile errors to runtime errors which is objectively bad but we have at the same time avoided big change of the JLS by transforming Intrinsics.ldc/invokedynamic to API point only. We may still need enhanced constant folding but that's another concern. 
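
[Editor's sketch] The "pointer check" idea above can be approximated in plain Java. This is only an analogy under stated assumptions: `SameInstanceGuard` and `check` are hypothetical names, and a real implementation would presumably live in an indy bootstrap and call site rather than in a helper object. The guard remembers the first argument it sees and rejects any later argument that is not the same instance.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper for illustration; not part of JEP 303 or JEP 348. */
final class SameInstanceGuard {
    // Holds the first argument ever passed through this guard.
    private final AtomicReference<Object> first = new AtomicReference<>();

    /** Returns arg if it is the same instance as on the first call, otherwise throws. */
    Object check(Object arg) {
        Objects.requireNonNull(arg, "arg");
        // Record the first instance; later calls leave the stored value untouched.
        first.compareAndSet(null, arg);
        if (first.get() != arg) { // reference (pointer) comparison, not equals()
            throw new IllegalStateException("argument is not constant at this call site");
        }
        return arg;
    }
}
```

The cost of this approach is the one named in the message: a mistake that a compiler could have reported only shows up when the second, different argument arrives at run time.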
Rémi

* you can still generate a special indy for a call to Intrinsics.ldc or Intrinsics.invokedynamic that will verify at runtime that the arguments never changed, moving the compile error into a runtime error. You may think it's a stupid idea but it greatly simplifies the spec.

>
>> On Feb 12, 2019, at 7:15 PM, Remi Forax wrote:
>>
>> JEP 348 provides a way to replace a method call with a ldc/indy,
>> so it's another way to implement JEP 303 (the intrinsification part) given that
>> an Intrinsics. (resp a ldc.invokedynamic) is just a method annotated with both
>> PolymorphicSignature and CompilerIntrinsicCandidate.
>>
>> class Intrinsics {
>> @CompilerIntrinsicCandidate
>> @PolymorphicSignature
>> public static Object invokedynamic(BootstrapSpecifier indy, Object... args) {
>> return null; }
>> }
>>
>> So in my opinion, we should withdraw JEP 303, adds a new JEP only about the
>> constant propagation + mirroring of ConstantDesc and add Intrinsics.ldc and
>> Intrinsic.invokedynamic as part of JEP 348.
>>
>
> Rémi

From brian.goetz at oracle.com Thu Feb 14 12:48:24 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 14 Feb 2019 07:48:24 -0500
Subject: JEP 303 vs JEP 348
In-Reply-To: <1832795720.54507.1550145041246.JavaMail.zimbra@u-pem.fr>
References: <1616777606.490271.1550016939398.JavaMail.zimbra@u-pem.fr> <667178F3-AD6D-416E-ADEF-183BBBA815BC@oracle.com> <1832795720.54507.1550145041246.JavaMail.zimbra@u-pem.fr>
Message-ID: <1255FB73-6DE1-47EE-9A9E-C2CDFE60E062@oracle.com>

>> For JEP 348, this is an opportunistic
>> optimization; for 303, it is essential functionality (the call can't proceed
>> un-intrinsified.) And, for the indy() intrinsic, it simply cannot live without
>> the constant folding and propagation. So separating those aspects is not
>> practical.
>
> yes, on paper, but for JEP 348 you also need to know if the arguments of an intrinsics are constant or not.

Silly Remi, you've gotten confused by the difference between "constant" and "constant" :)

JEP 303 defines a broader, separate notion of "intrinsifiable constant", which is deliberately different from "constant expression". JEP 348 only needs to know about the weaker, already-present notion of constant expression.

The intrinsic for indy() is useless without the broader form, but we can optimize String.format just fine without it.

>> The two differ dramatically in their spec impact, as well. 348 requires little
>> more than permission to redirect the translation for specific methods; 303 is
>> much more deeply intrusive.
>
> We can pull the VarHandle trick !
> Instead of specifying the semantics of Intrinsics.ldc/invokedynamic in the JLS which is as you said intrusive, you can move the spec part in the javadoc by transforming the compile time error to a runtime one.
> If the arguments of Intrinsics.ldc/invokedynamic are not constant, the compiler can still generate an indy call that will verify at runtime that the arguments are constants (technically by verifying that methods are always called with the same instances doing pointer checks). So we have moved compile errors to runtime errors which is objectively bad but we have at the same time avoided big change of the JLS by transforming Intrinsics.ldc/invokedynamic to API point only.

Ah, now you reveal your actual goal: to ship the indy() intrinsic sooner :) I really wish you'd lead with this stuff :)

This is a clever trick, but results in a measurably worse feature, since with constant folding it is easy to get wrong, and getting compile-time help is pretty valuable.

> We may still need enhanced constant folding but that's another concern.

No, you will absolutely need enhanced folding to make it work; making an IndyRef requires making MethodTypeRefs, MethodHandleRefs, etc. (And even if we tried to special-case the "build it all at once" case, without error messages at compile time to tell you when you messed up, it would be a very frustrating feature to use.)
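
[Editor's sketch] The two meanings of "constant" in this exchange can be seen in a small Java class. The sketch assumes that the thread's MethodTypeRef corresponds to what later shipped as `java.lang.constant.MethodTypeDesc` (Java 12 and later); `ConstantKinds`, `isKnownFormat` and `describe` are names made up for the illustration.

```java
import java.lang.constant.ConstantDescs;
import java.lang.constant.MethodTypeDesc;

final class ConstantKinds {
    // A constant expression in the existing JLS sense: a literal in a static final String.
    static final String FORMAT = "%d items";

    // Not a JLS constant expression (it is produced by a method call),
    // although its value never changes once the class is initialized.
    static final MethodTypeDesc LENGTH_TYPE =
            MethodTypeDesc.of(ConstantDescs.CD_int, ConstantDescs.CD_String);

    // Case labels must be constant expressions, so FORMAT is accepted here.
    static boolean isKnownFormat(String s) {
        switch (s) {
            case FORMAT: return true;
            default: return false;
        }
    }

    static String describe() {
        return String.format(FORMAT, 3) + " / " + LENGTH_TYPE.descriptorString();
    }
}
```

Today's compiler can see through `FORMAT` but not through `LENGTH_TYPE`; as I read the thread, the broader "intrinsifiable constant" notion is what would let the compiler fold values of the second kind as well.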
From forax at univ-mlv.fr Thu Feb 14 15:35:05 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 14 Feb 2019 16:35:05 +0100 (CET)
Subject: JEP 303 vs JEP 348
In-Reply-To: <1255FB73-6DE1-47EE-9A9E-C2CDFE60E062@oracle.com>
References: <1616777606.490271.1550016939398.JavaMail.zimbra@u-pem.fr> <667178F3-AD6D-416E-ADEF-183BBBA815BC@oracle.com> <1832795720.54507.1550145041246.JavaMail.zimbra@u-pem.fr> <1255FB73-6DE1-47EE-9A9E-C2CDFE60E062@oracle.com>
Message-ID: <2040137659.125928.1550158505924.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 14 February 2019 13:48:24
> Subject: Re: JEP 303 vs JEP 348

>>> For JEP 348, this is an opportunistic
>>> optimization; for 303, it is essential functionality (the call can't proceed
>>> un-intrinsified.) And, for the indy() intrinsic, it simply cannot live without
>>> the constant folding and propagation. So separating those aspects is not
>>> practical.
>>
>> yes, on paper, but for JEP 348 you also need to know if the arguments of an
>> intrinsics are constant or not.
>
> Silly Remi, you've gotten confused by the difference between "constant" and
> "constant" :)
>
> JEP 303 defines a broader, separate notion of "intrinsifiable constant", which
> is deliberately different from "constant expression". JEP 348 only needs to
> know about the weaker, already-present notion of constant expression.

The goal is to provide intrinsics for ldc and indy; the notion of intrinsifiable constants is just a means proposed by 303 to achieve that goal. I think it's a nice addition and not a core part.

>
> The intrinsic for indy() is useless without the broader form, but we can
> optimize String.format just fine without it.

"useless" is a strong word here, you are losing the folding property, nothing more.

>
>>> The two differ dramatically in their spec impact, as well. 348 requires little
>>> more than permission to redirect the translation for specific methods; 303 is
>>> much more deeply intrusive.
>>
>> We can pull the VarHandle trick !
>> Instead of specifying the semantics of Intrinsics.ldc/invokedynamic in the JLS
>> which is as you said intrusive, you can move the spec part in the javadoc by
>> transforming the compile time error to a runtime one.
>> If the arguments of Intrinsics.ldc/invokedynamic are not constant, the compiler
>> can still generate an indy call that will verify at runtime that the arguments
>> are constants (technically by verifying that methods are always called with the
>> same instances doing pointer checks). So we have moved compile errors to
>> runtime errors which is objectively bad but we have at the same time avoided
>> big change of the JLS by transforming Intrinsics.ldc/invokedynamic to API point
>> only.
>
> Ah, now you reveal your actual goal: to ship the indy() intrinsic sooner :) I
> really wish you'd lead with this stuff :)

I think you are optimistic here, JEP 303 may never ship. Let me remind you of your own argument against having a Java syntax for indy: the economics are against it, indy is used by a few hundred Java devs, several orders of magnitude fewer than the number of people who are using Java.

>
> This is a clever trick, but results in a measurably worse feature, since with
> constant folding it is easy to get wrong, and getting compile-time help is
> pretty valuable.

Better is sometimes the enemy of good. If you get the constant folding wrong, you will get an exception at runtime. And I believe that in most cases people will store IndyRefs inside static final fields for the same reason you have java.lang.constant.ConstantDescs, so the current meaning of "constant" may be enough.

>
>> We may still need enhanced constant folding but that's another concern.
>
> No, you will absolutely need enhanced folding to make it work; making an IndyRef
> requires making MethodTypeRefs, MethodHandleRefs, etc. (And even if we tried
> to special-case the "build it all at once" case, without error messages at
> compile time to tell you when you messed up, it would be a very frustrating
> feature to use.)

I'm fine with frustrating a few people about what the feature could have been if it gets that feature delivered*. And the truth is that storing an IndyRef in a static final field is not that awful compared to what we have to do now.

Moreover, I believe we can still add the constant folding later, after Intrinsics.invokedynamic has been delivered. It will turn a runtime error into a compile error, which is not a backward compatible change, but it's the equivalent of generifying an API call used only by a happy few.

Rémi

* weirdly our roles are inverted in this discussion, I should be the academic guy :)

From brian.goetz at oracle.com Tue Feb 19 22:50:43 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 19 Feb 2019 17:50:43 -0500
Subject: Fwd: nest syntax alternative
In-Reply-To:
References:
Message-ID: <6ef8a518-4205-6af1-6bdb-ce303ced4333@oracle.com>

Received on the -comments list.

My analysis: while the package { ... } syntax acts as a nice container for multiple class units, and might well have been a nicer syntax for multiple classes in the same source file than aux classes, it's really no different than allowing aux classes to be public; we are still left with the same problem of finding the source file corresponding to com/foo/X.class, because it will not necessarily be the corresponding com/foo/X.java in the source path.

-------- Forwarded Message --------
Subject: nest syntax alternative
Date: Fri, 15 Feb 2019 13:19:00 +0100
From: Maarten Van Puymbroeck
To: amber-spec-comments at openjdk.java.net

Hello,

Just thinking out loud here.
Remi's proposals about the nest syntax and flattening of nested subtypes gave me the idea of "package files". This would allow all classes of a package to be defined in one file instead of a directory, making them implicitly part of the same nest without the nested class issue (if it's considered an issue). The concept would be compiled away completely.

shedding a bike:

package com.test.expressions {
    sealed class Expr {}
    record Value(int value) extends Expr;
    record Add(Expr left, Expr right) extends Expr;
}

Apart from the choice of nest host, this probably has other concerns. But maybe the idea might spark some other ideas...

Kind regards,
Maarten.
(just a silent follower)

From john.r.rose at oracle.com Tue Feb 19 23:52:32 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 19 Feb 2019 15:52:32 -0800
Subject: nest syntax alternative
In-Reply-To: <6ef8a518-4205-6af1-6bdb-ce303ced4333@oracle.com>
References: <6ef8a518-4205-6af1-6bdb-ce303ced4333@oracle.com>
Message-ID:

On Feb 19, 2019, at 2:50 PM, Brian Goetz wrote:
>
> ...we are still left with the same problem of finding the source file corresponding to com/foo/X.class, because it will not necessarily be the corresponding com/foo/X.java in the source path.

Yes, this is a key problem. A flattened "binary name" like pkg.X or pkg.X$Y is converted to a file system query on an internal name like pkg/X.class or pkg/X$Y.class. If both classes were to be defined in one bundle of bits, then (I think) one of the following conditions must hold: A reference to either class must converge to a reference to that one *.class file, or else (given that a reference to either class internalizes as a reference to a specific classfile name unique to it) both class file names must somehow converge to locate a single copy of the bits.
More concisely either this: pkg.X, pkg.Y =converge=> pkg/XY.class => bits for X, Y or this: (pkg.X => pkg/X.class, pkg.Y => pkg/Y.class) =converge=> bits for X, Y The first alternative appears to require a convergence mapping at the name level, while the second can also rely on a convergence mapping in the file system (sym. links) or in the files themselves (brief forwarding records). The first alternative seems to me to split again into two ways, depending on whether the user of a class name has a burden to record the convergence. That is, if my source code refers to X or Y, does javac place an extra bit of information that helps locate their common definition XY? Or is it the job of the JVM and other implementors of the classpath mechanism to scan definitions like XY and "register" their willingness to define both names? Leaning some more on the (odd but suggestive) term "convergence", the alternatives might be called: 1. def-site convergence 2. use-site convergence 3. class-path convergence ?based on where the primary responsibility of converging X, Y to XY occurs. Use-site convergence is actually a pretty reasonable technique for nested classes, since Java mandates that, if a compiler which translates X.Y to pkg.X$Y at the source level must *also* record that X is the definer of X.Y in the InnerClasses attribute. This gives a possible "hook" for extending class loaders to search pkg/X.class for a nearby definition of pkg.X$Y. This technique could probably be extended to associate "affiliated" classes which are not actually related by a nesting relation, but instead are located in the same source file. So use-site convergence (via some InnerClasses-like stuffing) could help guide classpath searches. It would *not* help with source-path searches, however; those would have to crawl through package folders and peek inside of source files to find hidden class declarations. 
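
[Editor's sketch] The name mapping and the InnerClasses "hook" discussed above can be observed from plain Java. `BinaryNames` and `toClassFilePath` are hypothetical names for this illustration; the mapping shown is the conventional one used for class-path lookup, and the assumption that `Class.getDeclaringClass()` is answered from the InnerClasses attribute reflects my understanding of the JVM rather than anything stated in the thread.

```java
final class BinaryNames {
    // A nested class; javac emits it as a separate file named BinaryNames$Y.class.
    static final class Y {}

    /** Maps a binary name such as pkg.X$Y to the class-path resource pkg/X$Y.class. */
    static String toClassFilePath(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // The binary name of the nested class uses '$' instead of '.'.
        System.out.println(Y.class.getName());
        // The resource a class loader would look up for that name.
        System.out.println(toClassFilePath(Y.class.getName()));
        // The recorded "definer" of the nested class.
        System.out.println(Y.class.getDeclaringClass().getName());
    }
}
```

Nothing comparable exists today for an auxiliary top-level class: its class file carries no pointer back to a host, which is the gap the "affiliated classes" idea would need to fill.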
In fact, the source-path mechanisms seem (to me) more resistant (than classpath mechanisms) to any notion of convergence, since we are talking about human-written source files, not classfiles which we have some control over. Nevertheless, the logic of the alternatives above applies somewhat to source-path considerations also:

1. def-site convergence = source path scanners need to peek inside all path files
2. use-site convergence = source files need an explicit "import X from Y" type statement to declare locations
3. path convergence = source paths need to be augmented with summaries (pre-compiled?) of what's stored where, perhaps rolled up in package-info.*.

It seems to me we might make progress with a mixed solution: Inner classes use today's available hooks; affiliated classes use forwarding pointers (def-site c.) in the file system, either sym-links if appropriate or stub classfiles which emulate sym-links on systems which lack them. E.g., the stub classfile X.class would contain a zero-length constant pool and the unqualified name of the classfile XY.class which defines X in that same package & folder.

(Yep, Maarten, you sparked some musings.)

-- John

From forax at univ-mlv.fr Wed Feb 20 11:42:49 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 20 Feb 2019 12:42:49 +0100 (CET)
Subject: nest syntax alternative
In-Reply-To:
References: <6ef8a518-4205-6af1-6bdb-ce303ced4333@oracle.com>
Message-ID: <1399477625.735364.1550662969786.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Wednesday, 20 February 2019 00:52:32
> Subject: Re: nest syntax alternative

> On Feb 19, 2019, at 2:50 PM, Brian Goetz wrote:
>>
>> ...we are still left with the same problem of finding the source file
>> corresponding to com/foo/X.class, because it will not necessarily be the
>> corresponding com/foo/X.java in the source path.

I don't get why it's complicated.
For def-site convergence, as you said we need an attribute inside X.class that says that X is inside XY, like the InnerClass attribute. For the use-site convergence, we should generate classfile of X to be flat i.e. X can be in the source file XY while still being generated as X and not X inside XY. If we don't generate X flat, it will not be compatible to seal an existing interface (it will disrupt the use site). R?mi > > Yes, this is a key problem. A flattened "binary name" like pkg.X > or pkg.X$Y is converted to a file system query on an internal > name like pkg/X.class or pkg/X$Y.class. If both classes were > to be defined in one bundle of bits, then (I think) one of the > following conditions must hold: A reference to either class > must converge to a reference to that one *.class file, or else > (given that a reference to either class internalizes as a > reference to a specific classfile name unique to it) both > class file names must somehow converge to locate a > single copy of the bits. > > More concisely either this: > > pkg.X, pkg.Y =converge=> pkg/XY.class => bits for X, Y > > or this: > > (pkg.X => pkg/X.class, pkg.Y => pkg/Y.class) =converge=> bits for X, Y > > The first alternative appears to require a convergence > mapping at the name level, while the second can also > rely on a convergence mapping in the file system (sym. > links) or in the files themselves (brief forwarding records). > > The first alternative seems to me to split again into two > ways, depending on whether the user of a class name > has a burden to record the convergence. That is, if > my source code refers to X or Y, does javac place an > extra bit of information that helps locate their common > definition XY? Or is it the job of the JVM and other > implementors of the classpath mechanism to scan > definitions like XY and "register" their willingness to > define both names? 
>
> Leaning some more on the (odd but suggestive) term
> "convergence", the alternatives might be called:
>
> 1. def-site convergence
> 2. use-site convergence
> 3. class-path convergence
>
> ...based on where the primary responsibility of
> converging X, Y to XY occurs.
>
> Use-site convergence is actually a pretty reasonable
> technique for nested classes, since Java mandates
> that a compiler which translates X.Y to pkg.X$Y
> at the source level must *also* record that X is the
> definer of X.Y in the InnerClasses attribute. This
> gives a possible "hook" for extending class loaders
> to search pkg/X.class for a nearby definition of
> pkg.X$Y. This technique could probably be extended
> to associate "affiliated" classes which are not actually
> related by a nesting relation, but instead are located
> in the same source file.
>
> So use-site convergence (via some InnerClasses-like
> stuffing) could help guide classpath searches. It would
> *not* help with source-path searches, however; those
> would have to crawl through package folders and peek
> inside of source files to find hidden class declarations.
>
> In fact, the source-path mechanisms seem (to me) more
> resistant (than classpath mechanisms) to any notion of
> convergence, since we are talking about human-written
> source files, not classfiles which we have some control
> over. Nevertheless, the logic of the alternatives above
> applies somewhat to source-path considerations also:
>
> 1. def-site convergence = source path scanners need
> to peek inside all path files
> 2. use-site convergence = source files need an explicit
> "import X from Y" type statement to declare locations
> 3. path convergence = source paths need to be augmented
> with summaries (pre-compiled?) of what's stored where,
> perhaps rolled up in package-info.*.
>
> It seems to me we might make progress with a mixed
> solution: Inner classes use today's available hooks;
> affiliated classes use forwarding pointers (def-site c.)
> in the file system, either sym-links if appropriate or
> stub classfiles which emulate sym-links on systems
> which lack them. E.g., the stub classfile X.class
> would contain a zero-length constant pool and
> the unqualified name of the classfile XY.class
> which defines X in that same package & folder.
>
> (Yep, Maarten, you sparked some musings.)
>
> -- John

From gavin.bierman at oracle.com Wed Feb 27 12:43:15 2019
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Wed, 27 Feb 2019 13:43:15 +0100
Subject: Switch expressions spec
In-Reply-To: <7A903926-3E7C-498A-9D57-E5AA0462D568@oracle.com>
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com> <5C393807.8000500@oracle.com> <5C3CF3AB.5030602@oracle.com> <7A903926-3E7C-498A-9D57-E5AA0462D568@oracle.com>
Message-ID: <2A8B1F01-52A7-4688-B4F5-CBB73826500E@oracle.com>

I have uploaded a revised switch expressions spec at:

http://cr.openjdk.java.net/~gbierman/switch-expressions.html

This is functionally equivalent to the spec uploaded last month. The change is in how we specify the type checking of switch expressions. We have made simplifications so that it is more consistent with the specification of conditional expressions. The behaviour of type checking is unchanged.

Thanks,
Gavin

PS: I have left the January version at http://cr.openjdk.java.net/~gbierman/switch-expressions-old.html for reference.

> On 17 Jan 2019, at 10:14, Gavin Bierman wrote:
>
> Thank you Alex and Tagir. I have uploaded a new version of the spec at:
>
> http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>
> This contains all the changes you suggested below. In addition, there is a small bug fix in 5.6.3 concerning widening (https://bugs.openjdk.java.net/browse/JDK-8213180). I have also taken the opportunity to reorder chapter 15 slightly, so switch expressions are now section 15.28 and constant expressions are now section 15.29 (the last section in the chapter).
>
> Comments welcome!
> Gavin