multiline: Must we waste one of the final few 'free' symbols on this?
Brian Goetz
brian.goetz at oracle.com
Thu Mar 1 19:33:54 UTC 2018
And, to close the syntactic-real-estate-stewardship loop: Using ` for
exotic identifiers has a far worse return-on-syntax than using it for
string literals. The former will only be used by 0.1% of developers,
0.1% of the time; the latter are used by every developer, all of the
time. One of the reasons we didn't use ` for exotic identifiers is that
we felt the return-on-syntax was too poor, so we saved it for something
better to come along. And this is that something better.
On 3/1/2018 2:29 PM, forax at univ-mlv.fr wrote:
> Hi Cyrill,
> you can even use 2+ single quotes instead of 3+ double quotes.
>
> To answer the question: "maybe one day exotic identifiers may want to use backticks" is not a good argument, for several reasons:
> - you can use the same symbol for both; it depends on exactly where you allow exotic identifiers. For example, if they are only for exported symbols (names of methods, fields, and classes that are visible outside), there is no problem, because you can treat a raw string as a local construct, the same way the grammar treats 'module' or 'exports' as a keyword depending on the context.
> - we can use """ as you suggest (or '') to implement exotic identifiers.
> - as my grandma used to say whenever I started a sentence with 'what if': the future paralyses only the fool :)
>
> regards,
> Rémi
>
> ----- Original Message -----
>> From: "Cyrill Brunner" <cyrill.brunner at hispeed.ch>
>> To: "Remi Forax" <forax at univ-mlv.fr>, "Reinier Zwitserloot" <reinier at zwitserloot.com>
>> Cc: "amber-dev" <amber-dev at openjdk.java.net>
>> Sent: Thursday, 1 March 2018 19:16:19
>> Subject: Re: multiline: Must we waste one of the final few 'free' symbols on this?
>> The points raised in that discussion are valid reasons for not using fixed,
>> possibly single-letter delimiters for raw strings, yes.
>> But for the second suggestion made, would it not also be possible to use 3+
>> symmetric double quotes? It would leave the use case of `while` etc. open
>> whilst producing none of the detriments mentioned in the JEP directly.
>>
>> So instead of
>>
>> String text = ``This contains a backtick: `.``
>>
>> you could similarly do
>>
>> String text = """"This string contains a triple quote: """.""""
>>
>> This would leave ` untouched for now, just like string "prefixes", whilst still
>> allowing the arbitrary-but-symmetric number of double quotes.
>>
>> What would speak against this?
>>
>> - Cyrill Brunner
>>
>>
>> On 28.02.2018 at 14:24, Remi Forax wrote:
>>> see
>>> http://mail.openjdk.java.net/pipermail/amber-spec-experts/2018-February/000286.html
>>>
>>> Rémi
>>>
>>> ----- Original Message -----
>>>> From: "Reinier Zwitserloot" <reinier at zwitserloot.com>
>>>> To: "amber-dev" <amber-dev at openjdk.java.net>
>>>> Sent: Wednesday, 28 February 2018 07:06:20
>>>> Subject: multiline: Must we waste one of the final few 'free' symbols on this?
>>>> Some feedback on multiline string literals. Where 'proposal' is referenced,
>>>> it refers to: https://bugs.openjdk.java.net/browse/JDK-8196004
>>>>
>>>> # Must we waste one of the final few 'free' symbols on this? #
>>>>
>>>> If you look at all easily accessible symbols on a keyboard, the only ones
>>>> that don't yet have a syntactic meaning in Java source files are the
>>>> backtick and the hash. Everything else either already has a meaning or is
>>>> defined to be an identifier part, which makes using it as a symbol somewhat
>>>> difficult (that'd be the underscore and the dollar, although the underscore
>>>> has already backwards-incompatibly been torn out; presumably the dollar can
>>>> be 'rescued' in the same fashion). Is THIS what we're going to spend one of
>>>> our final 2 to 3 symbols on?
>>>>
>>>> One obvious alternate use for the backtick is for encoding identifiers; if
>>>> you want to name a method "while", which the JVM spec does allow you to do,
>>>> you could maybe one day use backticks. Some JVM-targeted languages already
>>>> do this. I'm not saying this is a good idea, but I am saying that
>>>> implementing the raw string literal proposal as written pretty much
>>>> eliminates this notion from ever seeing the light of day, forever. Perhaps
>>>> it's worth some debate before we just casually close that door in
>>>> perpetuity.
>>>>
>>>> alternatives:
>>>>
>>>> Is: R"This is a raw string" an option? An advantage to the 'R' concept is
>>>> that you can separate 'escapes aren't processed' ('raw') from 'feel free to
>>>> newline in these' ('multiline'): The R indicates raw, and hitting enter
>>>> immediately after the quote indicates multiline, which would be backwards
>>>> compatible as currently it's always illegal Java if you newline in the
>>>> middle of string literals. Thus:
>>>>
>>>> String x = R"Escapes \t are not processed here; this contains raw backslash-t instead of a tab";
>>>> String multi = "
>>>> This is
>>>> multiline but \t DOES contain a tab";
>>>> String rawMulti = R"
>>>> This is
>>>> multi with \t backslash-t literally, not a tab";
>>>>
>>>> Another option would be to investigate the use of triple quotes. In java9
>>>> syntax, having 3 quotes in immediate succession cannot possibly be valid in
>>>> a source file unless in a comment. Therefore, it would seem possible to use
>>>> triple quotes as a delimiter without creating the ambiguity mentioned in
>>>> the 'Choice of delimiters' section. Example:
>>>>
>>>> String regex = """Hey now I don't have to \w+ escape my backslashes!""";
>>>>
>>>> This syntax also has quite a lot of precedent (Kotlin, Swift, Groovy, and
>>>> Python). Note that the 'other languages' section misconstrues how Python
>>>> works; triple quotes are for multiline strings. For raw strings, you use
>>>> R"foo". Most Python programmers seem to think the R stands for regex, as
>>>> that's pretty much what they're always used for. Nevertheless, it stands
>>>> for 'raw'. See:
>>>> https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
>>>>
>>>> With regard to investigating simply allowing Java strings to contain
>>>> newlines: the 'Choice of delimiters' section has this quote:
>>>>
>>>>> Enabling such a feature would affect tools and tests that assume
>>>>> multi-line traditional string literals as an error.
>>>>
>>>> This makes no sense. Any unupdated tool would consider use of a backtick
>>>> also an error. Either way, tools not aware of the new feature would treat
>>>> multiline string literals as a syntax error, whether you use backtick,
>>>> quote, or triple-quote. Unlike the introduction of very fancy footwork to
>>>> treat backslash-u escapes as raw inside these literals, addition of
>>>> backtick (or triple quote, or single quote) as signifying raw and/or
>>>> multiline strings won't be particularly difficult for existing java parsers
>>>> to implement. It doesn't seem relevant as an argument for or against any
>>>> particular delimiter.
>>>>
>>>> --Reinier Zwitserloot
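
The symmetric-delimiter rule discussed in this thread (a raw string opened by
a run of N delimiter characters ends at the next run of exactly N of them) can
be sketched in Java as below. This is a minimal illustration, not the
proposal's lexer: the class RawStringSketch and the method rawBody are
hypothetical names, the input is assumed to start at the opening delimiter,
and an empty body is not handled.

```java
class RawStringSketch {

    /**
     * Hypothetical helper: given text that starts with a run of `delim`
     * characters, returns the body up to the next run of exactly the
     * same length. Shorter or longer runs are treated as part of the body.
     */
    static String rawBody(String src, char delim) {
        // Measure the opening run of delimiter characters.
        int open = 0;
        while (open < src.length() && src.charAt(open) == delim) {
            open++;
        }
        if (open == 0) {
            throw new IllegalArgumentException("no opening delimiter");
        }

        int i = open;
        while (i < src.length()) {
            if (src.charAt(i) != delim) {
                i++;
                continue;
            }
            // Measure this run; it closes the literal only if its length
            // matches the opening run exactly.
            int runStart = i;
            while (i < src.length() && src.charAt(i) == delim) {
                i++;
            }
            if (i - runStart == open) {
                return src.substring(open, runStart);
            }
        }
        throw new IllegalArgumentException("unterminated raw string");
    }

    public static void main(String[] args) {
        // The two examples from the thread, written as ordinary Java strings.
        String backticks = "``This contains a backtick: `.``";
        String quotes = "\"\"\"\"This string contains a triple quote: \"\"\".\"\"\"\"";

        System.out.println(rawBody(backticks, '`'));
        System.out.println(rawBody(quotes, '"'));
    }
}
```

With either delimiter the scan is the same: the single backtick and the
triple quote inside the bodies are skipped because their run length differs
from the opening run, which is why the choice of character does not change
the difficulty of the rule itself.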