[RSL] RSL update
Brian Goetz
brian.goetz at oracle.com
Mon Jun 18 15:52:31 UTC 2018
Time to take stock of where we are with respect to the multi-line aspect
of Raw String Literals. Not surprisingly, this has taken a few
iterations to get to a reasonable place. I think we're in a pretty
reasonable place right now.
The main challenge is separating "intended" indentation of multi-line
strings from the "incidental" indentation that comes from wanting the
embedded snippet to look reasonable in the context of the surrounding
code, or that is inserted by IDEs. This, in turn, has prompted an exploration of
"what is the user thinking" (always a dangerous question) and a number
of proposed tweaks to allow the user greater control over saying "these
four spaces may look like incidental indentation, but they are in fact
intended." Earlier attempts at providing this control added complexity,
but Jim has found a way to get that effect while rolling back the
complexity.
The main transform we've been designing is what we're now calling
`align()` (previously `stripIndent()` or `trimIndent()`), which is to
remove all seemingly-incidental horizontal and vertical indentation.
This has been simplified as follows:
- Now removes all leading and trailing blank lines;
- Left-justifies remaining text based only on indentation of non-blank
lines.
The observation that led us here is: if the user wants some extra
horizontal indentation, it's better to specify this explicitly (say, via
an `indent(n)` method) than to rely on significant whitespace of the
trailing line. Similarly, if the user wants extra vertical indentation,
that can also be added explicitly. We think that the current operation
now essentially finds Kevin's minimal "rectangular box", and then the
user can explicitly add back any extra indentation (horizontal or
vertical) that is desired.
There are a few reasons why this operation is important:
- Most users will not want to have to start their code "undented"
relative to the Java code, but instead will want it to embed cleanly
both horizontally and vertically;
- As the code is refactored, incidental indentation will change, and
this may cause instability in the output. Users will want a way to get
to stable output.
Raw string literals already normalize end-of-line characters. We could
describe the transformations that `align()` does as further normalizing
horizontal whitespace.
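To make the simplified rules concrete, here is a minimal Java sketch of what `align()` might do. It is written as a static helper, since raw string literals and a `String.align()` method are only proposals at this point. The class name `AlignSketch`, the helper names, the treatment of interior blank lines (emptied), and the absence of a trailing line terminator are all assumptions made for illustration, not the actual implementation. It also assumes line terminators have already been normalized to `\n`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the simplified align() rules; not the real implementation.
final class AlignSketch {

    // Number of leading whitespace characters on a line.
    public static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // A line is blank if it consists only of whitespace (or is empty).
    public static boolean isBlank(String line) {
        return leadingWhitespace(line) == line.length();
    }

    public static String align(String s) {
        List<String> lines = new ArrayList<>(Arrays.asList(s.split("\n", -1)));

        // Rule 1: remove all leading and trailing blank lines.
        while (!lines.isEmpty() && isBlank(lines.get(0))) {
            lines.remove(0);
        }
        while (!lines.isEmpty() && isBlank(lines.get(lines.size() - 1))) {
            lines.remove(lines.size() - 1);
        }

        // Rule 2: find the smallest indentation among non-blank lines only.
        int min = Integer.MAX_VALUE;
        for (String line : lines) {
            if (!isBlank(line)) {
                min = Math.min(min, leadingWhitespace(line));
            }
        }

        // Left-justify: drop that common indentation from every non-blank line.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.size(); i++) {
            if (i > 0) {
                out.append('\n');
            }
            String line = lines.get(i);
            if (!isBlank(line)) {
                out.append(line.substring(min));
            }
            // Assumption: interior blank lines are kept, but emptied.
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The same snippet at two nesting depths, as might happen after a refactoring.
        String deep = "\n        <html>\n            <body/>\n        </html>\n    ";
        String shallow = "\n  <html>\n      <body/>\n  </html>\n";
        System.out.println(align(deep));
        System.out.println(align(deep).equals(align(shallow)));
    }
}
```

Both inputs reduce to the same minimal box, so re-indenting the surrounding code does not perturb the result.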
In order to make the above argument work, there needs to be an easy way
to do relative indentation, which is proposed as:
    String indent(int n)
This shifts each line of a multi-line string to the left (negative n) or
to the right (positive n) by |n| whitespace characters. We can then
define, for convenience:
    String align(int n) { return align().indent(n); }
so that users can express normalization + indentation in one go:
    String s = `
        blah blah
        blah
    `.align(4); // normalized, indented 4 chars
So there are two indentation mechanisms: relative (indent) and absolute
(align). This covers the waterfront.
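A relative `indent(n)` along these lines could look like the following Java sketch. It is a static helper under assumed semantics: positive n prepends n spaces to every line (including empty ones), negative n removes up to |n| leading whitespace characters per line and never eats non-whitespace, and lines are separated by `\n`. The name `IndentSketch` is hypothetical, and the post does not pin these details down.

```java
// Hypothetical sketch of a relative indent(n); semantics for edge cases are assumed.
final class IndentSketch {

    public static String indent(String s, int n) {
        String[] lines = s.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            String line = lines[i];
            if (n >= 0) {
                // Shift right: prepend n spaces.
                for (int k = 0; k < n; k++) {
                    out.append(' ');
                }
                out.append(line);
            } else {
                // Shift left: drop at most |n| leading whitespace characters,
                // stopping early at the first non-whitespace character.
                int remove = 0;
                while (remove < -n
                        && remove < line.length()
                        && Character.isWhitespace(line.charAt(remove))) {
                    remove++;
                }
                out.append(line.substring(remove));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "blah blah\n  blah";
        System.out.println(indent(text, 4));
        System.out.println(indent(indent(text, 4), -4).equals(text));
    }
}
```

With a helper like this, `align(n)` is just the composition given above: normalize first, then shift the result by n.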
Assuming we've factored this down to the appropriate primitives, the
remaining decision to be made here is: should the language try to
auto-align multi-line strings, or is asking users to explicitly use a
library method (`string.align()`) better?
Arguments in favor of the library approach:
- Many embedded languages don't care about indentation anyway (HTML,
SQL, JSON);
- The string mangling algorithm is somewhat complicated (though less
than it used to be) and subjective, both strikes against pushing it into
the language;
- If auto-alignment doesn't do what the user wants, it may be hard to
get back to what the user does want.
Arguments in favor of the language approach:
- Most usages of this feature will want alignment anyway, and having
to explicitly ask for it feels like noise;
- Failure to normalize leading whitespace will mean that the
indentation of output will be perturbed by ordinary code refactoring
(which might lead to instabilities in tests);
- It is easy to explicitly specify additional horizontal or vertical
indentation if desired; normalizing the rest of the time makes results
more predictable.
Did I miss any?
(Note that arguments about constant pool efficiency or runtime
efficiency are mostly red herrings; `align()` can be safely folded at
compile time in the library approach, and a principled framework for
such transformations is in the works.)