PROPOSAL: String literals: version 1.1

rssh at gradsoft.com.ua
Tue Mar 3 02:15:32 PST 2009


AUTHOR(s): Ruslan Shevchenko, Jeremy Manson (if he agrees), Reinier
Zwitserloot (if he agrees)

OVERVIEW:

FEATURE SUMMARY:
New string literals in the Java language:
* multiline string literals.
* string literals without escape processing.

MAJOR ADVANTAGE:
Strings written in other languages can be embedded more elegantly: SQL
statements or inline XML (with multiline strings), and regular expressions
(with string literals without escape processing).

MAJOR DISADVANTAGE:
None known to the author.

ALTERNATIVES:

For multiline strings, use concatenation operators and methods, such as:

  "Multiline \n".concat(
                 "string ");

 or

  String bigString="First line\n"+
                   "second line";

For unescaped ('raw') strings, use escaping in an ordinary Java string.


EXAMPLES

SIMPLE EXAMPLE:

Multiline string:

 <pre>
  StringBuilder sb = new StringBuilder();
  sb.append("""select a from Area a, CountryCodes cc
                where
                   cc.isoCode='UA'
                  and
                   a.owner = cc.country
              """);
  if (question.getAreaName()!=null) {
     sb.append("""and
                  a.name like ?
               """);
     sqlParams.setString(++i,question.getAreaName());
  }
 </pre>

 instead of:
 <pre>
  StringBuilder sb = new StringBuilder();
  sb.append("select a from Area a, CountryCodes cc\n");
  sb.append("where cc.isoCode='UA'\n");
  sb.append("and a.owner=cc.country\n");
  if (question.getAreaName()!=null) {
     sb.append("and a.name like ?");
     sqlParams.setString(++i,question.getAreaName());
  }
 </pre>

Unescaped String:
 <pre>
 String myPattern=''..*\.*'';
 </pre>
 instead of
 <pre>
 String myPattern="..*\\.*";
 </pre>
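
To make the escaped form concrete, here is a small sketch in current Java,
assuming the intended pattern text is the six characters ..*\.* (the class
and method names are illustrative only). The backslash has to be doubled in
the ordinary literal so that a single backslash reaches the regex engine.

```java
import java.util.regex.Pattern;

// Illustrative sketch; EscapedPatternDemo and escapedPattern() are hypothetical names.
final class EscapedPatternDemo {

    // Ordinary Java literal: "\\" in the source denotes one backslash character.
    static String escapedPattern() {
        return "..*\\.*";
    }

    public static void main(String[] args) {
        String myPattern = escapedPattern();
        // Prints the pattern text as the regex engine sees it.
        System.out.println(myPattern);
        // The pattern text has six characters, although the literal spells seven.
        System.out.println(myPattern.length());
        // "a." is one arbitrary character followed by a dot.
        System.out.println(Pattern.matches(myPattern, "a."));
    }
}
```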

ADVANCED EXAMPLE:

 String platformDependent="""q
 """;
 is 'q\n' if compiled on Unix and 'q\r\n' if compiled on Windows.

 String platformIndependent="""
 """U;
 is always '\n'.

 String platformIndependent="""
 """W;
 is always '\r\n'.

 String empty="""
 """;
 is empty

String foo = """
     bar
                 baz
       bla
     qux""";

is equal to: String foo = "bar\n            baz\n  bla\nqux";

and the following:

String foo = """
    foo
bar""";

is a compile-time error.

String foo= """\"""" is "
String foo= """\""" """ is """

DETAILS:

A multiline string is a part of the program text that begins and ends with
three double quotes.

That is, the grammar in section 3.10.5 of the JLS can be extended as follows:

<pre>
MultilineStringLiteral:
        """ MultilineStringCharacters/opt """  LineTerminationSuffix/opt

MultilineStringCharacters:
        MultilineStringCharacter
        MultilineStringCharacters  (MultilineStringCharacter but not ")
        (MultilineStringCharacters but not "") "

MultilineStringCharacter:
        InputCharacter but not \
        EscapeSequence
        LineTermination

LineTerminationSuffix:
                      U | u | W | w

</pre>


An unescaped string is a part of the program text that begins and ends with
two single quotes.


<pre>
 RawStringLiteral:
                   '' RawInputCharacters/opt '' LineTerminationSuffix/opt

 RawInputCharacters:
                      ' (InputCharacter but not ')
                     |
                      (InputCharacter but not ') '
                     |
                      LineTermination
</pre>



COMPILATION:

Handling of multiline strings:

Text within """ delimiters is processed as follows:

1. It is split into a sequence of lines at line terminators.
2. Escape sequences in each line are processed exactly as in ordinary Java
strings.
3. Leading whitespace is eliminated as follows:
  - first, the sequence of whitespace characters (excluding
LineTermination, i.e. SP, HT, FF) at the start of the first nonempty line
is determined.
    Let's call it the 'leading whitespace sequence'.
  - all following lines must start with the same leading whitespace
sequence; otherwise a compile-time error is reported.
  - whitespace processing erases this leading sequence from the resulting
lines.
4. The lines that remain after erasing the leading whitespace sequence are
concatenated, with a line-termination sequence between each two neighbouring
lines.
   The inserted line-termination sequence depends on LineTerminationSuffix
and is
    - the value of the system property 'line.separator' if
LineTerminationSuffix is absent
    - LF (i.e. '\n') when LineTerminationSuffix is 'U' or 'u'
    - CR LF (i.e. '\r\n') when LineTerminationSuffix is 'W' or 'w'
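
The four steps above can be sketched in ordinary Java as follows. This is
only an illustration under stated assumptions, not a compiler change: the
class and method names are hypothetical, only the simple one-character
escapes are handled (octal and Unicode escapes are left out), an
IllegalArgumentException stands in for the compile-time error, and, since
the text does not say how empty lines are treated, lines before the first
nonempty line are passed through unchanged while the prefix check is
applied to every later line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the multiline processing steps; not part of any real compiler.
final class MultilineLiteralSketch {

    // Step 4: choose the separator from the optional suffix.
    static String separatorFor(String suffix) {
        if (suffix == null || suffix.isEmpty()) {
            return System.getProperty("line.separator");
        }
        switch (suffix) {
            case "U": case "u": return "\n";
            case "W": case "w": return "\r\n";
            default: throw new IllegalArgumentException("unknown suffix: " + suffix);
        }
    }

    // Step 2 (simplified): only single-character escapes are supported here.
    static String unescape(String line) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c != '\\') { out.append(c); continue; }
            if (i + 1 >= line.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = line.charAt(++i);
            switch (e) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case '"': out.append('"'); break;
                case '\'': out.append('\''); break;
                case '\\': out.append('\\'); break;
                default: throw new IllegalArgumentException("unsupported escape: \\" + e);
            }
        }
        return out.toString();
    }

    // Leading run of SP, HT and FF characters.
    static String leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length()
                && (line.charAt(i) == ' ' || line.charAt(i) == '\t' || line.charAt(i) == '\f')) {
            i++;
        }
        return line.substring(0, i);
    }

    static String process(String body, String suffix) {
        String[] lines = body.split("\r\n|\r|\n", -1);       // step 1
        List<String> result = new ArrayList<>();
        String prefix = null;
        for (String rawLine : lines) {
            String line = unescape(rawLine);                   // step 2
            if (prefix == null) {
                if (line.isEmpty()) { result.add(line); continue; }
                prefix = leadingWhitespace(line);              // step 3, first bullet
            } else if (!line.startsWith(prefix)) {
                throw new IllegalArgumentException("line does not start with the leading whitespace sequence");
            }
            result.add(line.substring(prefix.length()));       // step 3, third bullet
        }
        return String.join(separatorFor(suffix), result);     // step 4
    }

    public static void main(String[] args) {
        System.out.println(process("  select a\n    from Area a\n  where a.id = ?", "U"));
    }
}
```

Escapes are processed before the whitespace check because that is the order
in which the steps are listed; a different order would treat an escaped tab
at the start of a line differently.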



Handling of raw strings:
Text within '' delimiters is processed as follows:
1. It is split into a sequence of lines at line terminators.
2. The lines are concatenated, with a line-termination sequence between
each two neighbouring lines, exactly as for multiline strings.

Neither escape processing nor leading whitespace elimination is performed
when computing the resulting string value.
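
A minimal sketch of this simpler processing, again in ordinary Java and with
hypothetical names: the body is split at line terminators and rejoined with
the separator selected by the suffix, and backslashes and leading
whitespace are left exactly as written.

```java
// Hypothetical sketch of the unescaped-string processing; not part of any real compiler.
final class RawLiteralSketch {

    // Same suffix rule as for multiline strings.
    static String separatorFor(String suffix) {
        if (suffix == null || suffix.isEmpty()) {
            return System.getProperty("line.separator");
        }
        switch (suffix) {
            case "U": case "u": return "\n";
            case "W": case "w": return "\r\n";
            default: throw new IllegalArgumentException("unknown suffix: " + suffix);
        }
    }

    static String process(String body, String suffix) {
        // Split at CR LF, CR or LF; the limit -1 keeps trailing empty lines.
        String[] lines = body.split("\r\n|\r|\n", -1);
        // No unescaping and no whitespace stripping: just normalise the terminators.
        return String.join(separatorFor(suffix), lines);
    }

    public static void main(String[] args) {
        // The backslash survives untouched.
        System.out.println(process("..*\\.*", "U"));
    }
}
```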

The new string literals are created and used in .class files exactly as
ordinary strings.

TESTING:
Nothing special. Add the new string literals to the test cases.

LIBRARY SUPPORT:
None.

(It may make sense to add simple template processing to the standard
library, but I think that is a goal for the next language iteration. Many
good external frameworks exist today, such as Velocity; it is better to wait
and standardize support for the winner.)

REFLECTIVE APIS: None

OTHER CHANGES: None

MIGRATION: None

COMPATIBILITY
 None

REFERENCES

 http://bugs.sun.com/view_bug.do?bug_id=4165111
 http://bugs.sun.com/view_bug.do?bug_id=4472509
 http://docs.google.com/View?docid=d36kv8n_32g9zj7pdd by Jacek Furmankiewicz










