RFR - JDK-8223775 String::stripIndent (Preview)
Alex Buckley
alex.buckley at oracle.com
Tue May 21 18:11:42 UTC 2019
On 5/21/2019 7:56 AM, Jim Laskey wrote:
> Please do a code review of the new String::stripIndent instance method.
My interest is in the API spec because the JLS will make normative
reference to it.
"Removes horizontal white space margins from the essential body of a
Text Block originated string, while preserving relative indentation."
Lots of concepts here: margins, "essential body", a "Text Block
originated string" (not just any string?). Looks like the removal
happens in-place. While "essential" is a concept from the JEP, I think
the sibling concept, "incidental", is the one to lead with. Alluding to
multiple lines is also good. Recommend:
"Returns a string whose value is this string, with incidental white
space removed from the beginning and end of every line."
That is, 'stripIndent' isn't quite as generic as its name suggests. It
implements a particular policy about "incidental" white space that we
will now explain...
New paragraph: "Incidental white space is often present in a text block
to align the content with the opening delimiter. For example, in the
following code, dots represent incidental white space:
String html = """
..............<html>
.............. <body>
.............. <p>Hello, world</p>
.............. </body>
..............</html>
..............""";
Another new para, to link this high-falutin' incidental concept to the
workmanlike name of this method: "This method treats the incidental
white space as indentation to be stripped, producing a string that
preserves the relative indentation of the content. Using | to visualize
the start of each line of the string:
|<html>
| <body>
| <p>Hello, world</p>
| </body>
|</html>
Now, the algo. "This string is first conceptually separated into lines
as if by {@link String#lines()}." -- phrasing: "First, this string is
conceptually ..."
"Then, the <i>minimum indentation</i> (min) is determined as follows.
For each non-blank line (as defined by {@link String#isBlank()}), the
leading {@link Character#isWhitespace(int) white space} characters are
counted. The leading {@link Character#isWhitespace(int) white space}
characters on the last line are are also counted even if blank.
The <i>min</i> value is the smallest of these counts."
No more "common white space prefix" like in the JEP? OK, fair enough,
we're now fully rotated to "indentation" rather than "white space".
"For each non-blank line, <i>min</i> leading white space characters are
removed. Each white space character is treated as a single character. In
particular, the tab character {@code "\t"} (U+0009) is considered a
single character; it is not expanded.
The trailing white space of each line is also removed."
The trailing white space characters are just as "treated as a single
character" as the leading white space characters. Recommend: "For each
non-blank line, <i>min</i> leading white space characters are removed,
and any trailing white space characters are removed."
Recommend handling escape sequences in a separate para. Showing "\t", a
two character sequence, and then alluding to \u0009, a single character,
is confusing.
"Each line is suffixed with a line feed character {@code "\n"} (U+000A),
except for the last line. The last line is suffixed with a line feed
character {@code "\n"} (U+000A) only if the original last line was blank."
The phrasing "except for the last line" is odd because it suggests the
last line never gets a suffix. Except, oh, it might. And "suffixed" is
an unusual word. If you're going to allude to the original string's last
line, then isn't there a way to lean on it more? e.g. "A line feed
character is appended to every line if the corresponding line in the
original string was terminated with a line feed character." (Not saying
that is correct or desirable, but is easier to visualize.)
"Finally, the lines are concatenated into a single string and returned."
Recommend: "Finally, the lines are concatenated into a string, which is
the result."
Alex
More information about the core-libs-dev
mailing list