[raw-strings] Indentation problem

Tagir Valeev amaembo at gmail.com
Sat Jan 27 08:23:31 UTC 2018


Every language that implements multiline strings has problems with
indentation. For example, consider something like this:

public class Multiline {
  static String createHtml(String message) {
    String html = `<html>
  <head>
    <title>Message</title>
  </head>
  <body>`;
    if (message != null) {
      html += `
    <p>
      Message: `+message+`
    </p>`;
    }
    html += `
  </body>
</html>`;
    return html;
  }
}

Here the indentation of the embedded snippet breaks the indentation of
the Java program, harming its readability. The overall structure of the
method gets tangled with the structure of the generated HTML. This is
not just bad indentation that could be fixed by an IDE's
auto-formatting feature. You cannot fix it without either giving up the
multiline string syntax or changing the semantics. Some people
sacrifice the semantics, namely the indentation of the generated
output, when the output language is indentation-agnostic. HTML mostly
is, unless you have a <pre> section. So one may "fix" it like this:

public class Multiline {
  static String createHtml(String message) {
    String html = `<html>
        <head>
          <title>Message</title>
        </head>
        <body>`;
    if (message != null) {
      html += `
          <p>
            Message: `+message+`
          </p>`;
    }
    html += `
        </body>
      </html>`;
    return html;
  }
}

Now we have broken formatting in the generated HTML, which ruins the
idea of multiline strings (why bother to generate \n in the output HTML
if it looks like a mess anyway?). Moreover, the structure of the Java
program now affects the output. E.g. if you add several more nested
"if" or "switch" statements, you will need to indent <p> even more.

Many languages provide library methods to handle this. E.g.
trimIndent() could be provided to remove the leading spaces of every
line, but this would remove the HTML indentation entirely. Another
possibility is to provide a method like trimMargin() in Kotlin [1],
which trims all spaces before a special character (pipe by default),
including the special character itself.
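To make the idea concrete, here is a minimal sketch of what such a
library method might look like in today's Java. The class and method
names (MarginTrimmer, trimMargin) are hypothetical and not part of the
JDK, and the sketch only covers the behaviour described above: it does
not drop blank first or last lines the way Kotlin's version does.

```java
final class MarginTrimmer {
    // Hypothetical helper (not a JDK method): for every line whose first
    // non-whitespace character is the margin character, drop everything up
    // to and including that character. Other lines are left untouched.
    static String trimMargin(String text, char margin) {
        String[] lines = text.split("\n", -1); // -1 keeps trailing empty lines
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int j = 0;
            while (j < line.length() && Character.isWhitespace(line.charAt(j))) {
                j++;
            }
            if (j < line.length() && line.charAt(j) == margin) {
                line = line.substring(j + 1);
            }
            if (i > 0) {
                out.append('\n');
            }
            out.append(line);
        }
        return out.toString();
    }
}
```

Note that the trimming happens at run time on every call, which is the
cost discussed further down.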

Assuming such a method exists in Java, we can rewrite our method in a
prettier way, preserving both Java and HTML formatting:

public class Multiline {
  static String createHtml(String message) {
    String html = `<html>
      |  <head>
      |    <title>Message</title>
      |  </head>
      |  <body>`.trimMargin();
    if (message != null) {
      html += `
        |    <p>
        |      Message: `+message+`
        |    </p>`.trimMargin();
    }
    html += `
      |  </body>
      |</html>`.trimMargin();
    return html;
  }
}

This is almost nice. Even without syntax highlighting you can easily
distinguish between Java code and injected HTML code, you can indent
Java and HTML independently, and the HTML code does not clash with the
Java code structure. The only problem is the necessity to call the
trimMargin() method. This means that the original string is preserved
in the bytecode and at runtime, and the trimming is performed every
time the method is called, which costs both time and memory. This
problem could be minimized by making trimMargin() a javac intrinsic.
However, even in this case it would be hard to enforce usage of this
method, and I expect that tons of hard-to-read Java code would appear
in the wild, even though I believe that Java is about readability.

So I propose to enforce such a (or a similar) format at the language
level instead of adding a library method like "trimMargin()". The
syntax could be formalized like this:

- A raw string starts with a back-quote and ends with a back-quote, as
written in the draft before
- When a line-terminating sequence is encountered within a raw string,
the '\n' character is included in the string, and the literal is
interrupted
- After the interruption any amount of whitespace or comment tokens
is allowed and ignored
- The next meaningful token must be a pipe '|'. It's a compilation
error if any other token or EOF appears before '|', except comments or
whitespace
- After '|' the raw-string literal continues and may either end with a
back-quote or be interrupted again by the subsequent line-terminating
sequence.

Note that you don't need to escape pipes within the literals in any
special way.
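The rules above can be sketched as a tiny scanner written in today's
Java. This is only an illustration under stated assumptions, not a
proposed javac change: the names PipeLiteralScanner, scan and
skipToPipe are made up, only single back-quote delimiters are handled,
and \r\n is normalized to '\n'.

```java
final class PipeLiteralScanner {
    // Returns the value of the literal whose opening back-quote is at
    // src.charAt(start). Hypothetical helper for illustration only.
    static String scan(String src, int start) {
        if (src.charAt(start) != '`') {
            throw new IllegalArgumentException("expected a back-quote");
        }
        StringBuilder value = new StringBuilder();
        int i = start + 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '`') {
                return value.toString(); // closing back-quote
            }
            if (c == '\r' || c == '\n') {
                // Line terminator: include '\n' and interrupt the literal.
                if (c == '\r' && i + 1 < src.length() && src.charAt(i + 1) == '\n') {
                    i++;
                }
                value.append('\n');
                // Resume right after the pipe that continues the literal.
                i = skipToPipe(src, i + 1) + 1;
            } else {
                value.append(c); // includes any '|' inside the literal
                i++;
            }
        }
        throw new IllegalStateException("Unterminated raw string");
    }

    // Skips whitespace and comments; returns the index of the next '|'.
    private static int skipToPipe(String src, int i) {
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (src.startsWith("//", i)) {
                while (i < src.length() && src.charAt(i) != '\n' && src.charAt(i) != '\r') {
                    i++;
                }
            } else if (src.startsWith("/*", i)) {
                int end = src.indexOf("*/", i + 2);
                if (end < 0) {
                    throw new IllegalStateException("Unterminated comment");
                }
                i = end + 2;
            } else if (c == '|') {
                return i;
            } else {
                break; // any other token is an error
            }
        }
        throw new IllegalStateException("Multiline string must continue with a pipe token");
    }
}
```

Because the pipe is only looked for while the literal is interrupted, a
'|' in the middle of a line is copied into the value like any other
character.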

I see some advantages to such a syntax:
1. You can comment (or comment out!) a part of multiline string
without terminating it:

String sql = `SELECT * FROM table
    // Negative entry ID = deleted entry
    | WHERE entryID >= 0`;

If you want, you can still make this comment a part of the query
(assuming the DBMS accepts // comments):

String sql = `SELECT * FROM table
    | // Negative entry ID = deleted entry
    | WHERE entryID >= 0`;

Commenting out code:

String html = `<div>
/*  |   <span color='red'>
    |       Error
    |   </span>*/ // single-line comments would work as well
    |   Something wrong happened
    |</div>`;

2. Looking at a code fragment out of context (e.g. in a diff log), you
can tell that you are inside a multiline literal. For example, consider
reviewing a diff like

            | x++;
+           | if (x == 10) break;
            | foo(x);

Without pipes you could take it for Java code without giving it any
further thought. But now it's clear that it's part of a multiline
string (probably JavaScript!), so this is not direct Java logic and you
should check the broader context to understand what this literal is.

3. You cannot accidentally turn a big part of the program into a
multiline raw string just by forgetting to close the back-quote. A
compilation error will be issued right on the next line, like
"Multiline string must continue with a pipe token", not some obscure
message five screens below where the next raw string literal happens
to start.

4. IDEs will easily distinguish between in-literal indentation and
Java indentation and may allow you to adjust one or the other
independently.
In general, this greatly improves readability by clearly telling you
on every line that you're not in Java, but inside something nested.
You can easily nest a Java snippet inside a Java snippet, use multiline
raw strings inside it, and still not get lost!

String javaMethod = `public void dumpHtml() {
  |  System.out.println(``<!DOCTYPE html>
  |    |<html>
  |    |  <body>
  |    |    <h1>HelloWorld!</h1>
  |    |  </body>
  |    |</html>``);
  |}`;

One pipe means one level inside, two pipes mean two levels inside.

The only disadvantage I see in forcing a pipe prefix is the inability
to just paste a big snippet from somewhere into the middle of a Java
program in a plain text editor. However, any decent IDE would support
automatic addition of pipes on paste. If not, a simple
search-and-replace with a regex like s/^/   |/ through the pasted
content will do the job. Even adding pipes manually is not that hard (I
did this manually many times while writing this letter).
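For illustration, the same search-and-replace can be expressed with
java.util.regex. The helper name PipePrefixer.addPipes is made up for
this sketch; the (?m) flag turns on multiline mode so that ^ matches at
the start of every line.

```java
import java.util.regex.Matcher;

final class PipePrefixer {
    // Hypothetical helper: prefixes every line of the pasted text with the
    // given indent followed by a pipe, like the regex s/^/   |/.
    static String addPipes(String pasted, String indent) {
        // quoteReplacement guards against '$' or '\' in the indent string.
        return pasted.replaceAll("(?m)^", Matcher.quoteReplacement(indent + "|"));
    }
}
```

The first line would still need its prefix replaced by the opening
back-quote by hand, since under the proposed syntax only continuation
lines start with a pipe.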

What do you think?

[1] https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/trim-margin.html
