Wrapping up the first two courses
Dan Smith
daniel.smith at oracle.com
Fri May 10 21:04:08 UTC 2019
I generally like where this has landed.
I've been an uninvolved observer, and can't possibly process all the discussions on this mailing list over the past few months, so sorry if I've missed some of the core arguments on certain points. But I scanned through the various threads, and wanted to point out a couple of things in this conclusion that strike me as odd/unmotivated.
> On Apr 22, 2019, at 7:15 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
> So, I posit, we have consensus over the following things:
>
> - Multi-line strings are a useful feature on their own
> - Using “fat” delimiters for multi-line strings is practical and intuitive
There's an argument that "fat" delimiters are important because lots of use cases contain single quotes. Two thoughts on that:
- Okay, but that doesn't mean we have to prohibit "thin" delimiters, right? I have a weak preference for wanting to write multi-line strings using the standard " characters when I can get away with it. Seems more readable to me, especially for multi-line strings that aren't big chunks of marked-up text.
- What's the solution for single-line string literals that contain quotes? Fat delimiters are pretty hard to read when they're both on a single line, and I don't think the current story supports that anyway. If the solution is some "turn off escapes" mechanism, wouldn't the same mechanism work for multi-line strings?
> - There exists a reasonable alignment algorithm, which users can learn easily enough, and can be captured as a library method on String (some finer points to be hammered out)
Practically, the programming style I would want to use is Jim's example (h):
String h = """+--------+
              |  text  |
              +--------+""";
Occasionally—when the line is wide—I might want to fall back to one of the other styles (like (d)), but (h) would be my go-to.
Unfortunately, it seems like we've landed in a place where (h) is disallowed, because it can't be handled by a library method.
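To illustrate why a method that sees only the String payload struggles with (h), here is a minimal sketch. It assumes a simplified alignment rule: strip the smallest run of leading spaces shared by all non-blank lines. The class and method names are hypothetical, and this is not the algorithm being discussed on the list, only a stand-in for any context-free approach.

```java
final class CommonIndentSketch {
    // Hypothetical helper: count leading space characters (tabs are ignored in this sketch).
    static int leadingSpaces(String line) {
        int n = 0;
        while (n < line.length() && line.charAt(n) == ' ') n++;
        return n;
    }

    // Strip the indentation common to all non-blank lines of the payload.
    static String stripCommonIndent(String payload) {
        String[] lines = payload.split("\n", -1);
        int min = Integer.MAX_VALUE;
        for (String line : lines) {
            if (line.trim().isEmpty()) continue; // blank lines do not constrain the indent
            min = Math.min(min, leadingSpaces(line));
        }
        if (min == Integer.MAX_VALUE) min = 0;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) out.append('\n');
            String line = lines[i];
            out.append(line.trim().isEmpty() ? "" : line.substring(min));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The payload of style (h) as a context-free method would receive it:
        // the first line starts right after the opening delimiter, so its indent is 0.
        String h = "+--------+\n"
                 + "              |  text  |\n"
                 + "              +--------+";
        // The common indent is 0, so the continuation lines keep their 14 spaces.
        System.out.println(stripCommonIndent(h));
    }
}
```

Because the column of the opening delimiter is not part of the payload, the method has no way to know that the continuation lines were aligned to it.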
There have been various discussions about whether multi-line string literals are one-dimensional (open quote + payload + close quote) or two-dimensional (the contents of a rectangle in the editor). I think the two-dimensional model is the right abstraction: drawing a rectangle should be an inherent part of parsing a multi-line string literal. The "implicitly apply a library method to this string" view is based on the one-dimensional model, where after the fact we try to approximate the context of the literal and re-interpret the payload. Why tie our hands?
(Strawman: "We want a pluggable string processor." Me: "Since when is parsing supposed to be pluggable?")
As a pretty-simple definition of the 2D rectangle, I'd be happy with "all columns to the right of the opening delimiter, on all lines until the closing delimiter". Indents in between must use whitespace to align with the opening delimiter; if they don't, that's a parse error.
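The rectangle rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: the input is the raw source lines starting at the line with the opening delimiter, only space characters count as indentation (tabs are not handled), and the class name, method name, and use of IllegalArgumentException as the "parse error" are all hypothetical.

```java
final class RectangleSketch {
    static final String DELIM = "\"\"\"";

    // Extract everything to the right of the opening delimiter's column,
    // on every line, up to the closing delimiter.
    static String extract(String[] sourceLines) {
        int open = sourceLines[0].indexOf(DELIM);
        if (open < 0) {
            throw new IllegalArgumentException("no opening delimiter on the first line");
        }
        int left = open + DELIM.length(); // left edge of the rectangle
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < sourceLines.length; i++) {
            String line = sourceLines[i];
            if (i > 0) {
                // Continuation lines may contain only spaces left of the rectangle.
                int margin = Math.min(left, line.length());
                for (int c = 0; c < margin; c++) {
                    if (line.charAt(c) != ' ') {
                        throw new IllegalArgumentException(
                            "line " + i + ": non-space character at column " + c);
                    }
                }
                out.append('\n');
            }
            String row = line.length() > left ? line.substring(left) : "";
            int close = row.indexOf(DELIM);
            if (close >= 0) {
                return out.append(row, 0, close).toString();
            }
            out.append(row);
        }
        throw new IllegalArgumentException("no closing delimiter");
    }

    public static void main(String[] args) {
        String[] source = {
            "String h = \"\"\"+--------+",
            "              |  text  |",
            "              +--------+\"\"\";"
        };
        System.out.println(extract(source));
    }
}
```

Note that the function takes source lines rather than a String payload: the column of the opening delimiter is the piece of context that a context-free library method would not have.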
I realize that some people prefer a different style, and that this story is complicated by tab characters and variable-width fonts. So maybe there's another rule (or two) for the 2D rectangle when the first line is blank, based on the placement of the closing delimiter, or based on the leftmost non-whitespace character. But my high-level point is that I'd rather not force the algorithm to be defined on a context-free String.
> - To the extent the language performs alignment, it should be consistent with what the library-based version does, so that users can opt out and opt back in again
> - There needs to be an opt-out, for the cases where alignment is not the default the user wants
I want to say that, again relying on the 2D program text the parser is working with, the algorithm should be designed so that delimiters can be placed in a way to naturally indicate no trimming should occur. E.g., end delimiter in column 0 (sorry, case (d)). Others have suggested something along those lines. I don't know if you'd call that an "opt out", but the best opt-outs are the ones that don't need special syntax or rules.
(That's another reason my preferred style (h) doesn't work for everybody, because it requires at least 3 characters of indentation.)
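One way to picture the "closing delimiter in column 0 means no trimming" idea is the sketch below. It assumes a hypothetical rule in which the column of the closing delimiter sets the maximum number of leading spaces removed from each content line; the names are made up for illustration and nothing here is a settled design.

```java
final class ClosingDelimiterSketch {
    // Remove up to closingColumn leading spaces from each content line.
    // With closingColumn == 0 the content is returned untouched.
    static String trim(String[] contentLines, int closingColumn) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < contentLines.length; i++) {
            String line = contentLines[i];
            int strip = 0;
            while (strip < closingColumn
                    && strip < line.length()
                    && line.charAt(strip) == ' ') {
                strip++;
            }
            if (i > 0) out.append('\n');
            out.append(line.substring(strip));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] content = { "    <a>", "      <b>", "    </a>" };
        System.out.println(trim(content, 4)); // closing delimiter aligned with the content
        System.out.println(trim(content, 0)); // closing delimiter in column 0: nothing trimmed
    }
}
```

Under this assumed rule, no extra syntax is needed to opt out: the position of the delimiter in the program text carries the intent.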