String literals: some principles
Guy Steele
guy.steele at oracle.com
Fri May 3 20:37:36 UTC 2019
I completely agree with what you said here, John. We both took a good look, but you squinted with your right eye, and I with my left. :-) Either point of view is correct; the two together yield depth perception. Yay!
> On May 3, 2019, at 4:21 PM, John Rose <john.r.rose at oracle.com> wrote:
>
> On Apr 29, 2019, at 8:48 AM, Guy Steele <guy.steele at oracle.com> wrote:
>>
>>> On Apr 28, 2019, at 4:32 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>>
>>> . . .
>>> Looking ahead to the next round, we can build on this. In the first round, we mistakenly thought that there was something that could reasonably be called a “raw” string, but this notion is a fantasy; no string literal is so raw that it can’t recognize its closing delimiter. So “rawness” is really only a matter of degree.
>>
>> This is _almost_ true. If a string is truly raw (that is, it can contain _anything_), then one absolutely cannot depend on recognizing the closing delimiter by examining what might be the raw content.
>>
>> Put another way: one cannot determine how long the raw content is by examining it. That’s a solid principle.
>
> I'm going to be nit-picky here and refer to my earlier
> mentions of the paradigm of strong quoting, which
> at its heart simply means you have an infinite set of
> delimiters to choose from, when wrapping a payload
> into a literal syntax.
>
> Adding a numeral to the open quote means that there
> is now an unbounded set of open quotes, so it is an
> instance of strong quoting. Another instance of strong
> quoting adds nonces, and yet another just lengthens
> the quote pattern until it doesn't occur (anywhere) in
> the raw string payload.
>
> The numeric prefix convention is different from other
> kinds of strong quoting conventions, in that the end-quote
> can be a substring of the payload. Actually, the end-quote
> is most naturally the empty string, which is a substring
> of every string.
>
> The numeric prefix convention and other strong-quote
> conventions all share a common property: The convention
> as a whole is universal for arbitrary payloads, but for
> any given payload there are quotes which work and others
> that don't work. In the case of the numeric prefix
> convention, once you choose an open-quote (with
> numeral) you are limited to payloads of that length.
> That's not quite a "raw string" any more, since it's
> suitable only for a fixed-sized character field.
> Likewise, once you choose a particular nonce-based
> or patterned quote (e.g., seven double-quotes),
> payloads containing the corresponding end-quote
> as a substring are no longer suitable.
>
> Once you pick a particular payload string, the next
> question is whether you can embed that particular
> string into your program without inserting escape
> sequences. Only with a strong quote scheme of
> some sort is this possible. But, with any of several
> strong quote schemes, it is possible to dispense
> with escapes for any given string; it is not a fantasy.
>
> — John
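
For illustration, the two families of conventions discussed above can
be sketched in Java. This is a minimal sketch, not anything proposed in
this thread. The class StrongQuote and all of its method names are
hypothetical. The nonce syntax R"nonce(payload)nonce" is borrowed from
C++ raw string literals, and the numeric-prefix syntax (a decimal
length, a double quote, then the payload, with an empty end-quote) is
an assumption made up for the example. Lengths are counted in UTF-16
code units, which is another assumption.

```java
/** Hypothetical helper illustrating two strong-quoting conventions. */
final class StrongQuote {
    private StrongQuote() {}

    /**
     * Nonce convention: find a nonce such that the end-quote
     * ")" + nonce + "\"" does not occur anywhere in the payload.
     * The nonce is a run of 'x' characters, lengthened until it fits.
     */
    static String chooseNonce(String payload) {
        StringBuilder nonce = new StringBuilder();
        while (payload.contains(")" + nonce + "\"")) {
            nonce.append('x');
        }
        return nonce.toString();
    }

    /** Wraps the payload as R"nonce(payload)nonce" with no escapes. */
    static String quoteWithNonce(String payload) {
        String nonce = chooseNonce(payload);
        return "R\"" + nonce + "(" + payload + ")" + nonce + "\"";
    }

    /** Recovers the payload by scanning for the chosen end-quote. */
    static String unquoteWithNonce(String literal) {
        int open = literal.indexOf('(');
        if (!literal.startsWith("R\"") || open < 0) {
            throw new IllegalArgumentException("not a nonce-quoted literal");
        }
        String endQuote = ")" + literal.substring(2, open) + "\"";
        int close = literal.indexOf(endQuote, open + 1);
        if (close < 0) {
            throw new IllegalArgumentException("missing end-quote");
        }
        return literal.substring(open + 1, close);
    }

    /** Numeric-prefix convention: the open quote carries the length. */
    static String quoteWithLength(String payload) {
        return payload.length() + "\"" + payload;
    }

    /**
     * Recovers the payload by counting, never by examining the content.
     * Anything after the counted characters is ignored here.
     */
    static String unquoteWithLength(String literal) {
        int quote = literal.indexOf('"');
        if (quote < 0) {
            throw new IllegalArgumentException("missing open quote");
        }
        int length = Integer.parseInt(literal.substring(0, quote));
        return literal.substring(quote + 1, quote + 1 + length);
    }
}
```

With the nonce convention, a given open quote works only for payloads
that do not contain the matching end-quote, so the quote has to be
chosen after looking at the payload. With the numeric prefix, the
payload may contain anything at all, but a given open quote fits only
payloads of exactly that length.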