[RSL] RSL update

Brian Goetz brian.goetz at oracle.com
Mon Jun 18 15:52:31 UTC 2018


Time to take stock of where we are with respect to the multi-line aspect 
of Raw String Literals.  Not surprisingly, it has taken a few 
iterations, but I think we're in a pretty reasonable place right now.

The main challenge is separating "intended" indentation of multi-line 
strings from the "incidental" indentation that comes from wanting the 
embedded snippet to look reasonable in the context of the surrounding 
code / inserted by IDEs.  This, in turn, has prompted an exploration of 
"what is the user thinking" (always a dangerous question) and a number 
of proposed tweaks to allow the user greater control over saying "these 
four spaces may look like incidental indentation, but they are in fact 
intended."  Earlier attempts at giving the user this control added 
complexity, but Jim has found a way to get that effect while rolling 
that complexity back.

The main transform we've been designing is what we're now calling 
`align()` (previously `stripIndent()` or `trimIndent()`), which is to 
remove all seemingly-incidental horizontal and vertical indentation.  
This has been simplified as follows:
  - Now removes all leading and trailing blank lines;
  - Left-justifies remaining text based only on indentation of non-blank 
lines.
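To make the two rules above concrete, here is a minimal sketch of them in 
ordinary Java.  It is not Jim's implementation; the class name 
`AlignSketch`, the choice to turn interior blank lines into empty lines, 
and the choice to emit no trailing newline are all assumptions made for 
illustration.

```java
// Hypothetical sketch of the align() rules; not the proposed implementation.
final class AlignSketch {
    // Count of leading whitespace characters on a line.
    static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // A line is blank if it consists only of whitespace (or is empty).
    static boolean isBlank(String line) {
        return leadingWhitespace(line) == line.length();
    }

    static String align(String s) {
        String[] lines = s.split("\n", -1);

        // Rule 1: drop all leading and trailing blank lines.
        int first = 0;
        int last = lines.length - 1;
        while (first <= last && isBlank(lines[first])) first++;
        while (last >= first && isBlank(lines[last])) last--;

        // Rule 2: the left edge is set only by non-blank lines.
        int min = Integer.MAX_VALUE;
        for (int i = first; i <= last; i++) {
            if (!isBlank(lines[i])) {
                min = Math.min(min, leadingWhitespace(lines[i]));
            }
        }

        StringBuilder out = new StringBuilder();
        for (int i = first; i <= last; i++) {
            if (i > first) out.append('\n');
            // Assumption: interior blank lines become empty lines.
            if (!isBlank(lines[i])) out.append(lines[i].substring(min));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The same snippet at two nesting depths, written as ordinary literals.
        String shallow = "\n    SELECT *\n      FROM t\n";
        String deep = "\n            SELECT *\n              FROM t\n        ";
        System.out.println(align(shallow));
        System.out.println(align(shallow).equals(align(deep))); // prints true
    }
}
```

Under these assumptions the result depends only on the relative 
indentation of the non-blank lines, so re-indenting the enclosing Java 
code leaves the output unchanged.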

The observation that led us here is: if the user wants some extra 
horizontal indentation, it's better to specify this explicitly (say, via 
an `indent(n)` method) than to rely on significant whitespace of the 
trailing line.  Similarly, if the user wants extra vertical indentation, 
that can also be added explicitly.  We think that the current operation 
now essentially finds Kevin's minimal "rectangular box", and then the 
user can explicitly add back any extra indentation (horizontal or 
vertical) that is desired.

There are a few reasons why this operation is important:
  - Most users will not want to have to start their code "undented" 
relative to the Java code, but instead will want it to embed cleanly 
both horizontally and vertically;
  - As the code is refactored, incidental indentation will change, and 
this may cause instability in the output.  Users will want a way to get 
to stable output.

Raw string literals already normalize end-of-line characters.  We could 
describe the transformations that `align()` does as further normalizing 
horizontal whitespace.

In order to make the above argument work, there needs to be an easy way 
to do relative indentation, which is proposed as:

     String indent(int n)

This shifts each line of a multi-line string to the left (negative n) or 
to the right (positive n) by |n| whitespace characters.  We can then 
define, for convenience:

     String align(int n) { return align().indent(n); }

so that users can express normalization + indentation in one go:

     String s = `
                 blah blah
                      blah
                `.align(4); // normalized, indented 4 chars
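For illustration, relative indentation can be sketched as a static 
helper in ordinary Java.  This is only a sketch under stated 
assumptions: the class name `IndentSketch` is hypothetical, positive n 
pads every line (including empty ones) with spaces, and negative n 
removes at most |n| leading whitespace characters and never touches 
non-whitespace text.  None of those details are settled by the proposal 
above.

```java
// Hypothetical sketch of indent(n); the details are assumptions, not the spec.
final class IndentSketch {
    static String indent(String s, int n) {
        String[] lines = s.split("\n", -1);

        // Padding used when n is positive; empty when n <= 0.
        StringBuilder pad = new StringBuilder();
        for (int i = 0; i < n; i++) pad.append(' ');

        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) out.append('\n');
            String line = lines[i];
            if (n >= 0) {
                // Shift right: prepend n spaces to the line.
                out.append(pad).append(line);
            } else {
                // Shift left: strip up to |n| leading whitespace characters.
                int k = 0;
                while (k < -n && k < line.length()
                        && Character.isWhitespace(line.charAt(k))) {
                    k++;
                }
                out.append(line.substring(k));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // An already-normalized "rectangular box" of text.
        String box = "blah blah\n     blah";
        String shifted = indent(box, 4);
        System.out.println(shifted);
        // Shifting right and then left by the same amount is a round trip here.
        System.out.println(indent(shifted, -4).equals(box)); // prints true
    }
}
```

With a helper like this, `align(n)` is just the composition described 
above: normalize to the minimal box first, then shift the whole box by 
n.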

So there are two indentation mechanisms: relative (indent) and absolute 
(align).  This covers the waterfront.


Assuming we've factored this down to the appropriate primitives, the 
remaining decision to be made here is: should the language try to 
auto-align multi-line strings, or is asking users to explicitly use a 
library method (`string.align()`) better?

Arguments in favor of the library approach:
  - Many embedded languages don't care about indentation anyway (HTML, 
SQL, JSON);
  - The string mangling algorithm is somewhat complicated (though less 
than it used to be) and subjective, both strikes against pushing it into 
the language;
  - If auto-alignment doesn't do what the user wants, it may be hard to 
get back to what the user does want.

Arguments in favor of the language approach:
  - Most usages of this feature will want alignment anyway, and having 
to explicitly ask for it feels like noise;
  - Failure to normalize leading whitespace will mean that the 
indentation of output will be perturbed by ordinary code refactoring 
(which might lead to instabilities in tests);
  - It is easy to explicitly specify additional horizontal or vertical 
indentation if desired; normalizing the rest of the time makes results 
more predictable.

Did I miss any?


(Note that arguments about constant pool efficiency or runtime 
efficiency are mostly red herrings; `align()` can be safely folded at 
compile time in the library approach, and a principled framework for 
such transformations is in the works.)



More information about the amber-spec-experts mailing list