PROPOSAL: String literals: version 1.1

Reinier Zwitserloot reinier at zwitserloot.com
Tue Mar 3 02:33:26 PST 2009


I don't think you need the U/W suffix; make all newlines \n regardless  
of what they are in the source file. If this annoys anybody, they can  
either manually add the \r in the string, or run a  
simple .replace("\n", "\r\n") at the end. I find collapsing whitespace  
far more interesting, but there too a simple method solution works  
just fine: .replaceAll("\\s+", " ")
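
As a minimal sketch of those two method-based fixups (the class and method names here are made up for illustration; only String.replace and String.replaceAll are real API):

```java
class NewlineFixups {
    // Turn LF-only text into CRLF text, as with .replace("\n", "\r\n").
    // Assumes the input contains no \r yet; existing CRLF pairs would become \r\r\n.
    static String toCrLf(String s) {
        return s.replace("\n", "\r\n");
    }

    // Collapse every run of whitespace (newlines included) into a single space.
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String sql = "select a\n    from Area a\n  where a.name like ?";
        // prints: select a from Area a where a.name like ?
        System.out.println(collapseWhitespace(sql));
        // prints: true
        System.out.println(toCrLf("q\n").equals("q\r\n"));
    }
}
```

Neither call yields a compile-time constant, which is the trade-off against a dedicated literal suffix.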

The problem with the suffixes is their rarity: I doubt anybody would
know they're even legal in the first place, so anybody who does use one
effectively makes their code unreadable until the reader looks up the
exotic syntax.

Then again, there is precedent; the case of the hexadecimal floating
point literal is almost point-for-point the same as this situation:
exotic syntax that almost no Java programmer even knows is legal (double
foo = 0x1P0D is legal Java code!) but that exists anyway because literals
have a special meaning in Java (they get inlined), which stops
happening if you use an API utility to do the same thing:
Double.longBitsToDouble. Still, in the 0x1P0D case there really is no
alternative, whereas here you can always manually add \r to the string.
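
To make the comparison concrete, here is a small sketch (the class name is hypothetical) that puts the literal next to the API route:

```java
class HexFloatDemo {
    // Hexadecimal floating-point literal: significand 0x1, binary exponent 0,
    // D suffix for double. This is a compile-time constant.
    static final double LITERAL = 0x1P0D;

    // The same value built from its IEEE 754 bit pattern through the API.
    // A method call is not a constant expression, so this is not inlined.
    static final double FROM_BITS = Double.longBitsToDouble(0x3FF0000000000000L);

    public static void main(String[] args) {
        System.out.println(LITERAL);              // prints: 1.0
        System.out.println(LITERAL == FROM_BITS); // prints: true
    }
}
```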

  --Reinier Zwitserloot



On Mar 3, 2009, at 11:15, rssh at gradsoft.com.ua wrote:

> AUTHOR(s): Ruslan Shevchenko, Jeremy Manson (if agree), Reinier
> Zwitserloot (if agree)
>
> OVERVIEW:
>
> FEATURE SUMMARY:
> New string literals in the Java language:
> * multiline string literals.
> * string literals without escape processing.
>
> MAJOR ADVANTAGE:
> Strings written in other languages can be embedded more elegantly, such as
> SQL statements or inline XML (with multiline strings) or regular
> expressions (with string literals without escape processing).
>
> MAJOR DISADVANTAGE
> I don't know
>
> ALTERNATIVES:
>
> For multiline strings, use concatenation operators and methods, such as:
>
>  "Multiline \n".concat(
>                 "string ");
>
> or
>
>  String bigString="First line\n"+
>                   "second line";
>
> For unescaped ('row') strings, use ordinary Java strings with escaping.
>
>
> EXAMPLES
>
> SIMPLE EXAMPLE:
>
> Multiline string:
>
> <pre>
>  StringBuilder sb = new StringBuilder();
>  sb.append("""select a from Area a, CountryCodes cc
>                where
>                   cc.isoCode='UA'
>                  and
>                   a.owner = cc.country
>              """);
>  if (question.getAreaName()!=null) {
>     sb.append("""and
>                  a.name like ?
>               """);
>     sqlParams.setString(++i,question.getAreaName());
>  }
> </pre>
>
> instead of:
> <pre>
>  StringBuilder sb = new StringBuilder();
>  sb.append("select a from Area a, CountryCodes cc\n");
>  sb.append("where cc.isoCode='UA'\n");
>  sb.append("and a.owner=cc.country\n");
>  if (question.getAreaName()!=null) {
>     sb.append("and a.name like ?");
>     sqlParams.setString(++i,question.getAreaName());
>  }
> </pre>
>
> Unescaped String:
> <pre>
> String myPattern=''..*\.*'';
> </pre>
> instead of
> <pre>
> String myPattern="..*\\.*";
> </pre>
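
For reference, a short sketch of what the escaped form amounts to in current Java (the class name is hypothetical; the regex is the one from the example above):

```java
import java.util.regex.Pattern;

class EscapedPatternDemo {
    // The regex  ..*\.*  as an ordinary Java literal: only the backslash
    // has to be doubled, the dots and stars are written as-is.
    static final String PATTERN = "..*\\.*";

    public static void main(String[] args) {
        System.out.println(PATTERN);          // prints: ..*\.*
        System.out.println(PATTERN.length()); // prints: 6
        System.out.println(Pattern.matches(PATTERN, "abc...")); // prints: true
    }
}
```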
>
> ADVANCED EXAMPLE:
>
> String platformDependent="""q
> """;
> is 'q\n' if compiled on Unix and 'q\r\n' if compiled on Windows.
>
> String platformIndependent="""
> """U;
> is always '\n'.
>
> String platformIndependent="""
> """W;
> is always '\r\n'.
>
> String empty="""
> """;
> is empty
>
> String foo = """
>     bar
>                 baz
>       bla
>     qux""";
>
> is equal to: String foo = "bar\n            baz\n  bla\nqux";
>
> and the following:
>
> String foo = """
>    foo
> bar""";
>
> is a compile-time error.
>
> String foo= """\"""" is "
> String foo= """\""" """ is """
>
> DETAILS:
>
> A multiline string is a span of program text that begins and ends with
> three double quotes.
>
> I.e. the grammar in 3.10.5 of the JLS can be extended as:
>
> <pre>
> MultilineStringLiteral:
>        """ MultilineStringCharacters/opt """  LineTerminationSuffix/opt
>
> MultilineStringCharacters:
>        MultilineStringCharacter
>        MultilineStringCharacters  (MultilineStringCharacter but not ")
>        (MultilineStringCharacters but not "") "
>
> MultilineStringCharacter:
>        InputCharacter but not \
>        EscapeSequence
>        LineTermination
>
> LineTerminationSuffix:
>                      U | u | W | w
>
> </pre>
>
>
> An unescaped string is a span of program text that begins and ends with
> two single quotes.
>
>
> <pre>
> RowStringLiteral:
>                   '' RowInputCharacters/opt '' LineTerminationSuffix/opt
>
> RowInputCharacters:
>                      ' (InputCharacter but not ')
>                     |
>                      (InputCharacter but not ') '
>                     |
>                      LineTermination
> </pre>
>
>
>
> COMPILATION:
>
> Handling of multiline strings:
>
> Text within """ brackets is processed as follows:
>
> 1. It is split into a sequence of lines at line terminators.
> 2. Escape sequences in each line are processed exactly as in ordinary Java
> strings.
> 3. Leading whitespace is eliminated as follows:
>  - first, the sequence of whitespace characters (excluding
> LineTermination, i.e. SP, HT, FF) at the start of the first nonempty line
> is determined; let's call it the 'leading whitespace sequence'
>  - every following line must start with the same leading whitespace
> sequence, otherwise a compile-time error is raised
>  - the leading whitespace sequence is erased from each resulting line
> 4. The lines remaining after erasure of the leading whitespace sequence are
> concatenated, with a line-termination sequence between each two
> neighbouring lines.
>   The inserted line-termination sequence depends on LineTerminationSuffix
> and is
>    - the value of the system property 'line.separator' if
> LineTerminationSuffix is empty
>    - LF (i.e. '\n') when LineTerminationSuffix is 'U' or 'u'
>    - CR LF (i.e. '\r\n') when LineTerminationSuffix is 'W' or 'w'
>
>
> Handling of row strings:
> Text within '' brackets is processed as follows:
> 1. It is split into a sequence of lines at line terminators.
> 2. The lines are concatenated, with a line-termination sequence between
> each two neighbouring lines, exactly as in the case of multiline strings.
>
> No escape processing and no leading whitespace elimination are performed
> when computing the resulting string value.
>
> The new string literals are created and used in .class files exactly as
> ordinary strings.
>
> TESTING:
> Nothing special: add the new string literals to the test cases.
>
> LIBRARY SUPPORT:
> None.
>
> (It may make sense to add simple template processing to the standard
> library, but I think this is a goal for the next language iteration. Many
> good external frameworks exist now, such as Velocity; better to wait and
> standardize support for the winner.)
>
> REFLECTIVE APIS: None
>
> OTHER CHANGES: None
>
> MIGRATION: None
>
> COMPATIBILITY
> None
>
> REFERENCES
>
> http://bugs.sun.com/view_bug.do?bug_id=4165111
> http://bugs.sun.com/view_bug.do?bug_id=4472509
> http://docs.google.com/View?docid=d36kv8n_32g9zj7pdd by Jacek
> Furmankiewicz
>