Trailing white-space in text blocks

James Laskey james.laskey at oracle.com
Tue May 21 09:17:09 UTC 2019


Tagir,

These scenarios were considered in the decision, and I guess will be part of the rationale section of some doc TBA. 

The argument for removal of trailing white space is pretty close to the argument for normalization of line terminators. The developer needs to be able to rely on what they cannot see as much as what they can see.

As you mention, many editors strip trailing spaces on save, which would frustrate developers to no end. “Rats (expletive of choice), I need to add those (next expletive) spaces back again...”

A better option is to have some visible indication that the intent is to keep spaces at the end. Guy recommended boxing the string with long delimiters to possibly provide fixed length strings. I think something more flexible is required, say if you wanted 4 spaces at the end of each line.

If we move the line continuation discussion along, I think we have a possible candidate for a "keep the space" indicator.

If we go with \<LineTerminator> for line continuation, and 

    """
    First \
    Second \
    Third 
    """;

represents the string "First<Space>Second<Space>Third", then

    """
    First \n\
    Second \n\
    Third 
    """;

represents "First<Space>\nSecond<Space>\nThird". 
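For anyone who wants to try the two forms above, here is a minimal sketch. It assumes a JDK where text blocks accept \<LineTerminator> as a line continuation (JDK 15 and later do); the class and method names are made up for illustration. The closing delimiter sits on the last content line so that no final newline is added to the result.

```java
// Hypothetical demo class; assumes JDK 15+ (text blocks with \<LineTerminator>).
class LineContinuationDemo {

    // The space before each backslash is not trailing white space, so it survives stripping.
    static String joined() {
        return """
            First \
            Second \
            Third""";
    }

    // An explicit \n before the continuation keeps the line break and the space before it.
    static String withNewlines() {
        return """
            First \n\
            Second \n\
            Third""";
    }

    public static void main(String[] args) {
        // Make the spaces and newlines visible when printing.
        System.out.println(joined().replace(" ", "<Space>"));
        System.out.println(withNewlines().replace(" ", "<Space>").replace("\n", "\\n"));
    }
}
```

The point of the sketch is only that the backslash makes the preceding space visible in the source, so an editor that strips trailing spaces on save has nothing to strip.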

Elegant or ugly, this visible indication makes it clear that the space is intentional. Of course, it’s easy enough to do the same with any sequence.

    """
    First $
    Second $
    Third
    """.replace("$\n", "\n")

And, this works because we can rely on the fact that there is no incidental spacing after the $ and the line terminator is exactly \n. 
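A self-contained version of the marker idea, as a sketch only: it assumes a JDK with text blocks (JDK 15 or later), the '$' marker is just the arbitrary character used above, and the class and method names are invented for the example.

```java
// Hypothetical demo class; assumes JDK 15+ for text blocks.
class MarkerDemo {

    // '$' marks the end of each line whose trailing space must be kept.
    static String markedLines() {
        String marked = """
            First $
            Second $
            Third
            """;
        // Safe because the compiler strips anything after '$' and normalizes
        // every line terminator to \n.
        return marked.replace("$\n", "\n");
    }

    public static void main(String[] args) {
        // Make the spaces and newlines visible when printing.
        System.out.println(markedLines().replace(" ", "<Space>").replace("\n", "\\n"));
    }
}
```

Here the closing delimiter is on its own line, so the resulting string also ends with a newline. Any marker character works, provided it cannot appear before a line break in the real content.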

Cheers,

— Jim


> On May 21, 2019, at 12:36 AM, Tagir Valeev <amaembo at gmail.com> wrote:
> 
> Hello!
> 
> JEP 355 [1] says the following:
> 
>> Remove all trailing white space from all lines in the modified list of individual lines from step 5. (Hidden white space at the end of lines is unintentional, so it is overwhelmingly likely that the developer does not want it in the string.) Note that this step collapses wholly-whitespace lines in the modified list so that they are empty, but does not discard them.
> 
> I'm not sure this is a good idea. Consider the following quite
> possible user story:
> 
> Suppose I wrote a method like this (just to illustrate the problem,
> exact method details don't matter):
> 
> public static String formatList(List<String> list, int maxWidth) {
>  var result = new StringBuilder();
>  int currentCount = 0;
>  for (String s : list) {
>    if (currentCount > 0 && currentCount + s.length() + 1 > maxWidth) {
>      result.append("\n");
>      currentCount = 0;
>    }
>    result.append(s).append(" ");
>    currentCount += s.length() + 1;
>  }
>  return result.toString();
> }
> 
> The method joins a list of strings, wrapping the result if it exceeds the
> supplied maxWidth. Note that it adds a whitespace before the
> linebreak. Probably it's accidental and could be removed, but it's
> also possible that the method exists for a long time and somebody
> already relies on these trailing whitespaces, so probably I'm not in
> the position to modify the method.
> 
> Ok, we have a method, time to unit-test it. I lazily write:
> 
> public void testFormatList() {
>  String actual = StringUtil.formatList(List.of("One", "Two", "Three",
> "Four", "Five", "Six", "Seven"), 10);
>  assertEquals("", actual);
> }
> 
> Run the unit-test, it predictably fails and shows the difference
> between expected and actual. Using IDE diff view I examine the actual
> text, it looks like this:
> 
> One Two
> Three
> Four Five
> Six Seven
> 
> I confirm that I like it, so I paste it to the expected parameter into
> my unit-test changing the assertion to
> 
> assertEquals("""
>             One Two
>             Three
>             Four Five
>             Six Seven
>             """, actual);
> 
> It doesn't matter whether we add indent on the left or not. Looks
> pretty, but the test still fails. Closer examination of the output
> shows that expected code doesn't contain the trailing white-space
> while the actual output does. Usually people are not very
> attentive to small details. Assume that I've missed the part about
> trailing whitespace stripping (or I learned this feature from online
> tutorial where this detail wasn't mentioned). Now I'm in complete
> confusion. I just copied the actual to the expected, but lost my
> whitespaces. What happened?
> 
> - Probably my diff viewer was too smart at the copy action and
> stripped the whitespaces?
> - Probably my editor was too smart at the paste action and stripped
> the whitespaces?
> - Probably my editor was too smart during the file save operation and
> did additional cleanup which resulted in whitespace stripping? IDEA
> can actually do this, and it's a question whether this will affect
> multiline literals as well.
> - Probably compiler somehow stripped it (that's actually the case)
> - Probably I don't know something about assertEquals method? Probably
> it's buggy and strips whitespaces in one case but keeps them in
> another? I saw similar problem before with some custom assertion
> method.
> - Probably I misinterpreted the output of test failure and problem is
> not in whitespaces? The diff viewer of my IDE just highlights the
> single whitespace at the end of the every line, but I'm not sure,
> probably this highlighting means that something wrong with line
> separators?
> 
> Things could be much worse if I was too lazy to rerun the test locally
> (what for? I just copied actual to expected! What could go wrong? Come
> on, just commit this and move to the next task). Then CI build fails
> and I have more possibilities:
> 
> - Probably Git commit does something special with trailing whitespaces
> (e.g. we have pre-commit hook)
> - Probably CI pulls changes in some special way
> - Probably CI has different environment (JVM version, OS, filesystem)
> and this causes failure? Also CI output could be not very clear when
> the difference is in whitespaces only, and again I'm not sure what's
> going on.
> 
> So I have many things to check. If I'm experienced, I would probably
> open the .java file in hex viewer and check the actual bytes to ensure
> that the whitespaces are there, thus first three cases are ruled out,
> then probably read the spec. Less experienced developer would be
> completely lost.
> 
> Nevertheless after an hour or two, probably asking a colleague to help
> I will find the cause: the problem is really caused by the compiler.
> So I will spend another hour to find how to work-around it. I will see
> that escapes are handled after whitespace removal, thus will try to
> replace trailing spaces with \u0020 (very few developers are aware
> that it's not an actual string literal escape sequence). I will
> probably consult StackOverflow and find nothing. At the end I will
> give up and return to good old plain string literal. Not very
> productive day.
> 
> I'm not sure how to avoid such scenario if trailing whitespaces are
> actually stripped. I can think up some solutions how IDE could help
> pointing at the actual problem cause, but not every IDE could be smart
> enough to help user in such scenario.
> 
> I think that trailing whitespace stripping should be reconsidered.
> 
> With best regards,
> Tagir Valeev
> 
> [1] https://openjdk.java.net/jeps/355
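
The scenario in the quoted message can be reproduced roughly as follows. This is a sketch under stated assumptions: it needs a JDK with text blocks (JDK 15 or later), formatList is copied from the quoted message, and the class name and the other two helpers are hypothetical additions for the demonstration. The closing delimiter is placed on the last content line so that the only difference left between the two strings is the trailing space on each line.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; assumes JDK 15+ for text blocks.
class TrailingSpaceMismatch {

    // Copied from the quoted message: appends a space after every item,
    // including the last item on each line.
    static String formatList(List<String> list, int maxWidth) {
        var result = new StringBuilder();
        int currentCount = 0;
        for (String s : list) {
            if (currentCount > 0 && currentCount + s.length() + 1 > maxWidth) {
                result.append("\n");
                currentCount = 0;
            }
            result.append(s).append(" ");
            currentCount += s.length() + 1;
        }
        return result.toString();
    }

    // What the pasted text block yields: any trailing spaces in the source
    // are stripped by the compiler, so none reach the string.
    static String expectedFromTextBlock() {
        return """
            One Two
            Three
            Four Five
            Six Seven""";
    }

    // Helper for comparison only: strips trailing white space from each line.
    static String stripTrailingPerLine(String s) {
        return s.lines().map(String::stripTrailing).collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String actual = formatList(
                List.of("One", "Two", "Three", "Four", "Five", "Six", "Seven"), 10);
        System.out.println(actual.equals(expectedFromTextBlock()));
        System.out.println(stripTrailingPerLine(actual).equals(expectedFromTextBlock()));
    }
}
```

The first comparison is false and the second is true, which is the mismatch described in the message: the two strings differ only in white space that is invisible in the source.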



More information about the amber-spec-observers mailing list