Long line string literals

Jim Laskey james.laskey at oracle.com
Thu May 9 14:34:48 UTC 2019


How does a Java developer express a very long string? Note that this is
not just a multi-line string literal question. The issue relates to all
string literals.

Example,

    String ls = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc est libero, vehicula nec molestie in, semper aliquam magna.";

Current solution,

    String ls = "Lorem ipsum dolor sit amet, consectetur " +
                "adipiscing elit. Nunc est libero, vehicula " +
                "nec molestie in, semper aliquam magna.";
                
This works and will continue to work, but I think there is concern that
this pattern won't work when multi-line string literals are added to the
equation. There has been some debate about various machinations that
could be used.
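
For reference, the current pattern has no run-time cost: when every
operand of + is a string literal, the whole expression is a compile-time
constant and is folded into one string. A minimal sketch (the class and
field names are made up for illustration):

```java
// Hypothetical demo class; names are illustrative only.
class ConcatConstantDemo {
    static final String ONE_LINE =
        "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc est libero, vehicula nec molestie in, semper aliquam magna.";

    static final String SPLIT = "Lorem ipsum dolor sit amet, consectetur " +
                                "adipiscing elit. Nunc est libero, vehicula " +
                                "nec molestie in, semper aliquam magna.";

    // Both initializers are constant expressions, so both names refer to
    // the same interned String instance.
    static boolean sameInstance() {
        return ONE_LINE == SPLIT;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // prints true
    }
}
```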

Some of the parameters:

- The solution needs to be an escape sequence(s). This is the only
  mechanism we can introduce (now) and be backward compatible with
  traditional string literals. Other mechanisms, such as literal
  prefixing, are not open for discussion at this point in time. (+1)
  
- A Multi-line String Literal JEP goal is to make all escape sequences
  equally meaningful for traditional string literals and multi-line
  string literals. (+1)
  
- \<LineTerminator>, \<Space> and \<WhiteSpace> (white space includes LF
  and CR) have been proposed with various semantics for each. There is a
  concern about the lack of visibility of what comes after the \. Is it a
  space, tab, Unicode white space, LF or CR? How do you tell? (?1)
  
- When the new escape sequence(s) is in a traditional string literal the
  compiler scanner needs to treat the traditional string literal as
  multi-line. (-1)
  
The escape sequences suggested differ, but they are all variations of
consuming the escape and zero to N characters after (or before).

A) \<LineTerminator> or \<WhiteSpace> Just consume the (single)
   line terminator/white space.

Sample,

    String tsl = "Lorem ipsum dolor sit amet, consectetur \
                  adipiscing elit. Nunc est libero, vehicula \
                  nec molestie in, semper aliquam magna.";

    String msl = """
                 Lorem ipsum dolor sit amet, consectetur \
                 adipiscing elit. Nunc est libero, vehicula \
                 nec molestie in, semper aliquam magna.""";

This works if the line terminator follows immediately after the \ . (+1)

You cannot tell whether it is white space or a line terminator after the \ . (-1)

This does not work if there is one or more intervening white space
characters. (-1)

This works for multi-line string literals because of stripTrailing. (+1)

This does not work for traditional string literals because there is no
notion of auto alignment to strip the leading white space on the next
line. (-2)

B) \<WhiteSpace> Consume all white space up to and including the line
   terminator.

    Same sample as A).
    
Works in more cases than A). (+2)
    
Still does not work for traditional string literals because there is no
notion of auto alignment to strip the leading white space on the next
line. (-2)
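
The difference between A) and B) can be simulated on ordinary strings.
This is only a sketch under assumptions: the method names are
hypothetical, the regular expressions stand in for what a compiler
scanner would do, white space before the line terminator is taken to be
space, tab or form feed, and a preceding \\ escape is not handled.

```java
// Hypothetical simulation of options A) and B); not a scanner implementation.
class OptionABSketch {
    // A) Drop the backslash and the single line terminator right after it.
    static String optionA(String raw) {
        return raw.replaceAll("\\\\(\\r\\n|\\n|\\r)", "");
    }

    // B) Drop the backslash, any space/tab/form feed, and the line terminator.
    static String optionB(String raw) {
        return raw.replaceAll("\\\\[ \\t\\f]*(\\r\\n|\\n|\\r)", "");
    }

    public static void main(String[] args) {
        String clean = "consectetur \\\n    adipiscing";     // \ then LF
        String trailing = "consectetur \\ \n    adipiscing"; // \ then space, LF

        // Both options join the lines but keep the next line's indentation.
        System.out.println("[" + optionA(clean) + "]");
        System.out.println("[" + optionB(clean) + "]");

        // Only B) copes with white space between the \ and the line terminator.
        System.out.println("[" + optionA(trailing) + "]");
        System.out.println("[" + optionB(trailing) + "]");
    }
}
```

In both cases the four spaces of indentation on the second line survive,
which is the traditional string literal problem noted above.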

C) \<WhiteSpace> Consume all white space (including LF and CR) up to a
   non-white space or end of string.

    Same sample as A).
    
This works for both traditional and multi-line strings. (+1)
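
A minimal sketch of the C) semantics, simulated on an ordinary string.
The class and method names are hypothetical, and white space is assumed
to mean space, tab, form feed, CR and LF; other escape sequences are
copied through untouched.

```java
// Hypothetical simulation of option C); not a scanner implementation.
class OptionCSketch {
    // Assumed white space set: space, tab, form feed, CR, LF.
    static boolean isWhite(char c) {
        return c == ' ' || c == '\t' || c == '\f' || c == '\r' || c == '\n';
    }

    static String translate(String raw) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(i + 1);
                if (isWhite(next)) {
                    i++; // drop the backslash
                    // consume white space up to a non-white space or end of string
                    while (i < raw.length() && isWhite(raw.charAt(i))) {
                        i++;
                    }
                    continue;
                }
                // any other escape (including \\) is left as written
                out.append(c).append(next);
                i += 2;
                continue;
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "consectetur \\\n                  adipiscing elit.";
        System.out.println(translate(raw));
    }
}
```

Because the indentation on the following line is consumed as well, the
result no longer depends on auto alignment.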

Note that in A), B) and C) the next line may influence multi-line
indentation, i.e., escapes are translated after auto alignment. (?1)

D) \, (something other than white space) but otherwise the same as C)

    String tsl = "Lorem ipsum dolor sit amet, consectetur \,
                  adipiscing elit. Nunc est libero, vehicula \,
                  nec molestie in, semper aliquam magna.";

    String msl = """
                 Lorem ipsum dolor sit amet, consectetur \,
                 adipiscing elit. Nunc est libero, vehicula \,
                 nec molestie in, semper aliquam magna.""";

Works, but it just trades " + for \, . (?1)

E) \> (something other than white space)
      Consume all white space up to and including the line terminator.
   \< (something other than white space)
      Consume all white space back to beginning of line.

    String tsl = "Lorem ipsum dolor sit amet, consectetur \>
                  \<adipiscing elit. Nunc est libero, vehicula \>
                  \<nec molestie in, semper aliquam magna.";

    String msl = """
                 Lorem ipsum dolor sit amet, consectetur \>
                 \<adipiscing elit. Nunc est libero, vehicula \>
                 \<nec molestie in, semper aliquam magna.""";

A goal of the multi-line JEP was to make the string more readable, more
maintainable and less error prone. (-10)
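
For completeness, the two escapes in E) can be simulated on an ordinary
string as follows. This is a sketch under assumptions: the names are
hypothetical, white space on a line is taken to be space, tab or form
feed, and a \> or \< that is not positioned as described is simply left
in place.

```java
// Hypothetical simulation of option E); not a scanner implementation.
class OptionESketch {
    static String translate(String raw) {
        // \< : remove white space from the beginning of the line up to and
        // including the escape. Done first, while line boundaries still exist.
        String step1 = raw.replaceAll("(?m)^[ \\t\\f]*\\\\<", "");
        // \> : remove the escape, trailing white space and the line terminator.
        return step1.replaceAll("\\\\>[ \\t\\f]*(\\r\\n|\\n|\\r)", "");
    }

    public static void main(String[] args) {
        String raw = "consectetur \\>\n                  \\<adipiscing elit.";
        System.out.println(translate(raw));
    }
}
```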

Note for D) and E): is it an error if a non-white space character is
encountered, or does consumption just stop? (?1)





