Enhancing Java String Literals Round 2
Guy Steele
guy.steele at oracle.com
Mon Jan 7 20:58:26 UTC 2019
> On Jan 6, 2019, at 12:43 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> . . .
> (Of these, my current favorite is using the backslash: “cooked”, “””cooked and ML-capable”, \”raw”, \”””raw and ML capable”. The use of \ suggests “the backslashes have been pre-added for you”, building on existing associations with backslash.)
>
> Are there other credible candidates that I’ve missed?
I like the idea of cooked-ness and multiline-ness being orthogonal, and this proposal captures that neatly.
But: Even though it may not be used often, I still worry about being able to support multiple levels of nesting and/or being able to incorporate ANY raw string.
There have been comments pro and con about allowing any number of double quotes, or any odd number of double quotes. (I briefly pondered a compromise that would allow the use of one, three or five double quotes! But this morning I grew cold on it.)
So here is a variation on your proposal. The possible cases are:
"single line, may contain escapes"
\"single line, no escapes, cannot contain a double quote"
"""multiline, may contain escapes"""
\""•multiline, no escapes, cannot contain the nonce followed by two double quotes•""
where ‘•’ represents a nonce string (the same string at each end), which has to be one of the following:
* a single printable character (possibly further limit this choice?) that is not an encloser
* a left encloser, a string of characters that could be a Java identifier, and a matching right encloser
(possibly further limit the set of enclosers and/or identifier characters that may be used?)
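To make the two nonce rules concrete, here is a minimal sketch in Java of how a scanner might read the nonce that follows the opening \"" of a raw multiline literal. It is only an illustration of the rules as stated above, not part of the proposal: the class and method names (NonceReader, readNonce) are hypothetical, the set of enclosers is assumed to be ( ) [ ] { } < >, and "printable" is assumed to mean a visible ASCII character (0x21 through 0x7E). The proposal leaves both of those choices open.

```java
final class NonceReader {
    // Assumed encloser pairs; the proposal does not fix this set.
    private static final String LEFT = "([{<";
    private static final String RIGHT = ")]}>";

    /**
     * Reads the nonce starting at index pos of src, where pos is the
     * position just after the opening backslash and two double quotes.
     * Returns the nonce text, or throws if no valid nonce is present.
     */
    static String readNonce(String src, int pos) {
        if (pos >= src.length()) {
            throw new IllegalArgumentException("missing nonce");
        }
        char c = src.charAt(pos);
        int k = LEFT.indexOf(c);
        if (k >= 0) {
            // Rule 2: left encloser, identifier, matching right encloser.
            int i = pos + 1;
            if (i >= src.length() || !Character.isJavaIdentifierStart(src.charAt(i))) {
                throw new IllegalArgumentException("expected identifier after left encloser");
            }
            i++;
            while (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) {
                i++;
            }
            if (i >= src.length() || src.charAt(i) != RIGHT.charAt(k)) {
                throw new IllegalArgumentException("expected matching right encloser");
            }
            return src.substring(pos, i + 1);
        }
        // Rule 1: a single printable character that is not an encloser.
        // Assumption: printable means visible ASCII, so space is rejected.
        if (RIGHT.indexOf(c) >= 0 || c <= ' ' || c > '~') {
            throw new IllegalArgumentException("not a valid single-character nonce");
        }
        return String.valueOf(c);
    }
}
```

Under these assumptions, readNonce applied to text beginning with [HTML] yields the six-character nonce [HTML], text beginning with / yields the one-character nonce /, and text beginning with a double quote yields the double quote itself.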
Note that the nonce may be a double quote character if desired. So actual examples are:
\"""multiline, no escapes, cannot contain three consecutive double quotes"""
\""/multiline, no escapes, cannot contain
a slash followed by two double quotes/""
\""[HTML]multiline, no escapes, cannot contain left bracket,
H, T, M, L, right bracket, double quote, double quote
[HTML]""
So it is always possible to include a raw string literal within another raw string literal by choosing an appropriate nonce, and yet you don’t have to choose a weird nonce in the manifold situations where you don’t need one.
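The claim that a suitable nonce always exists can be sketched in Java as follows. This is an illustration under assumptions, not a specified algorithm: the names NonceChooser, works, chooseNonce and quote are hypothetical, and the fallback nonces [N0], [N1], ... are an arbitrary naming scheme chosen for the example. The sketch prefers the plain double quote and only falls back to a bracketed nonce when the payload would otherwise end the literal early.

```java
final class NonceChooser {
    /**
     * True if the payload can be wrapped with this nonce: scanning for the
     * terminator (nonce followed by two double quotes) must first find the
     * real terminator at the very end, not something inside the payload.
     */
    static boolean works(String nonce, String payload) {
        String terminator = nonce + "\"\"";
        return (payload + terminator).indexOf(terminator) == payload.length();
    }

    /** Picks the double quote when possible, else the first usable [Nk]. */
    static String chooseNonce(String payload) {
        if (works("\"", payload)) {
            return "\"";
        }
        // A finite payload contains only finitely many of the candidate
        // terminators, so this loop always reaches one that works.
        for (int n = 0; ; n++) {
            String candidate = "[N" + n + "]";
            if (works(candidate, payload)) {
                return candidate;
            }
        }
    }

    /** Wraps the payload as a raw multiline literal in the proposed syntax. */
    static String quote(String payload) {
        String nonce = chooseNonce(payload);
        return "\\\"\"" + nonce + payload + nonce + "\"\"";
    }
}
```

For example, quote applied to hello produces \"""hello""", with no unusual nonce. Quoting that result again has to switch to a bracketed nonce, because the inner literal contains three consecutive double quotes; under the naming scheme assumed here the outer literal is \""[N0]\"""hello"""[N0]"".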
One choice is to decide that the only permitted choices for the nonce are a double quote or square brackets. No one is likely to be confused by seeing an even number (two) of double quotes into thinking they denote an empty string, because the initial ones are immediately preceded by ‘\’ and the final ones are immediately preceded by ‘]’.
Food for thought.