PROPOSAL: String literals: version 1.1

Tue Mar 3 05:01:10 PST 2009

> String foo= """\""" """ is """

Is there really a need for escaping? If users want to escape, they can still use the regular "" string literals. I'd rather have no escaping at all. Otherwise, you probably also need to escape the \ if you want the string to contain \" at the end. An example: """An example:\\"""". The rule 'The new string literals don't support any escape sequences' is much better than 'The new string literals only support the escape sequences \""", which means """, and \\, which means \'.

Mind though, that \uxxxx will always be an escape sequence, since it is processed before the source is sent to the tokenizer.
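
A minimal sketch of that point, runnable with current Java: the Unicode escape below is translated before tokenization, so it can even spell an identifier. The class and method names are purely illustrative.

```java
final class UnicodeEscapeDemo {
    static int answer() {
        // The escape on the next line is translated to the letter 'a' before
        // the tokenizer runs, so this declares an ordinary local variable a.
        int \u0061 = 41;
        return a + 1;
    }

    public static void main(String[] args) {
        System.out.println(answer());
    }
}
```

The same early translation would apply inside any new string literal form, whatever escaping rules that form defines for itself.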

What about supporting " or "" at the beginning or end of the new string literals? That would result in """" or """"". Or even """""""" if you push it to the limits. Wouldn't that be hard on the compiler?

Roel

-----Original Message-----
From: reinier at zwitserloot.com [mailto:coin-dev-bounces at openjdk.java.net] On Behalf Of Reinier Zwitserloot
Sent: Tuesday, 3 March 2009 11:33
To: coin-dev at openjdk.java.net
Subject: Re: PROPOSAL: String literals: version 1.1

I don't think you need the U/W suffix; make all newlines \n regardless of what they are in the source file. If this annoys anybody, they can either manually add the \r in the string, or run a simple .replace("\n", "\r\n") at the end. I find collapsing whitespace far more interesting, but there too a simple method solution works just fine: .replaceAll("\\s+", " ")
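
A small sketch of those two method-based alternatives, using only String.replace and String.replaceAll; the class and method names are made up for illustration.

```java
final class NewlineFixups {
    // Turns every LF into CR LF, as an alternative to a W suffix.
    static String toWindows(String s) {
        return s.replace("\n", "\r\n");
    }

    // Collapses each run of whitespace (including newlines) into one space.
    static String collapse(String s) {
        return s.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(collapse("select a\n    from Area a"));
    }
}
```

Note that toWindows assumes its input contains only bare \n; applying it to a string that already has \r\n would double the \r.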

The problem with the suffixes is their rarity: I doubt anybody would know they are even legal in the first place, so anybody who does use one effectively makes the code unreadable until the reader looks up the exotic syntax.

Then again, there is precedent; the case of the hexadecimal floating point literal is almost point-for-point the same as this situation: exotic syntax that almost no Java programmer even knows is legal (double foo = 0x1P0D is legal Java code!) but that exists anyway because literals have a special meaning in Java (they get inlined), which stops happening if you use an API utility to do the same thing: Double.longBitsToDouble. Still, in the 0x1P0D case there really is no alternative, whereas here you can always manually add \r to the string.
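
To make the comparison concrete, here is a short sketch (class and field names are illustrative) showing the literal next to the API call. Both fields hold 1.0, but only the first is initialized with a compile-time constant expression.

```java
final class HexFloatDemo {
    // Hexadecimal floating point literal: significand 1, binary exponent 0.
    static final double LITERAL = 0x1P0D;

    // The same value built through the API from its IEEE 754 bit pattern.
    // A method call is not a constant expression, so this is not inlined.
    static final double VIA_API = Double.longBitsToDouble(0x3FF0000000000000L);

    public static void main(String[] args) {
        System.out.println(LITERAL == VIA_API);
    }
}
```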

  --Reinier Zwitserloot

On Mar 3, 2009, at 11:15, rssh at gradsoft.com.ua wrote:

> AUTHOR(s): Ruslan Shevchenko, Jeremy Manson (if agree), Reinier 
> Zwitserloot (if agree)
>
> OVERVIEW:
>
> FEATURE SUMMARY:
> New string literals in the Java language:
> * multiline string literals.
> * string literals without escape processing.
>
> MAJOR ADVANTAGE:
> Strings from other languages can be coded more elegantly, such as
> SQL statements or inline XML (with multiline strings) or regular
> expressions (with string literals without escape processing).
>
> MAJOR DISADVANTAGE
> I don't know
>
> ALTERNATIVES:
>
> For multiline strings, use concatenation methods and operators, such
> as:
>
>  "Multiline \n".concat(
>                 "string ");
>
> or
>
>  String bigString="First line\n"+
>                   "second line";
>
> For unescaped ('raw') strings, use escaping in an ordinary Java string.
>
>
> EXAMPLES
>
> SIMPLE EXAMPLE:
>
> Multiline string:
>
> <pre>
>  StringBuilder sb = new StringBuilder();
>  sb.append("""select a from Area a, CountryCodes cc
>                where
>                   cc.isoCode='UA'
>                  and
>                   a.owner = cc.country
>              """);
>  if (question.getAreaName()!=null) {
>     sb.append("""and
>                  a.name like ?
>               """);
>     sqlParams.setString(++i,question.getAreaName());
>  }
> </pre>
>
> instead of:
> <pre>
>  StringBuilder sb = new StringBuilder();
>  sb.append("select a from Area a, CountryCodes cc\n");
>  sb.append("where cc.isoCode='UA'\n");
>  sb.append("and a.owner=cc.country\n");
>  if (question.getAreaName()!=null) {
>     sb.append("and a.name like ?");
>     sqlParams.setString(++i,question.getAreaName());
>  }
> </pre>
>
> Unescaped String:
> <pre>
> String myPattern=''..*\.*'';
> </pre>
> instead of
> <pre>
> String myPattern="..*\\.*";
> </pre>
>
> ADVANCED EXAMPLE:
>
> String platformDependent="""q
> """;
> is 'q\n' if compiled on Unix and 'q\r\n' if compiled on Windows.
>
> String platformIndependent="""
> """U;
> is always '\n'.
>
> String platformIndependent="""
> """W;
> is always '\r\n'.
>
> String empty="""
> """;
> is empty
>
> String foo = """
>     bar
>                 baz
>       bla
>     qux""";
>
> is equal to: String foo = "bar\n            baz\n  bla\nqux";
>
> and the following:
>
> String foo = """
>    foo
> bar""";
>
> is a compile-time error.
>
> String foo= """\"""" is "
> String foo= """\""" """ is """
>
> DETAILS:
>
> Multiline strings are a part of the program text that begins and ends
> with three double quotes.
>
> I.e. the grammar in section 3.10.5 of the JLS can be extended as:
>
> <pre>
> MultilineStringLiteral:
>        """ MultilineStringCharacters/opt """  LineTerminationSuffix/opt
>
> MultilineStringCharacters:
>        MultilineStringCharacter
>        MultilineStringCharacters  (MultilineStringCharacter but not ")
>        (MultilineStringCharacters but not "") "
>
> MultilineStringCharacter:
>        InputCharacter but not \
>        EscapeSequence
>        LineTermination
>
> LineTerminationSuffix:
>                      U | u | W | w
>
> </pre>
>
>
> Unescaped strings are a part of the program text that begins and ends
> with two single quotes.
>
>
> <pre>
> RowStringLiteral:
>                   '' RowInputCharacters/opt '' LineTerminationSuffix/opt
>
> RowInputCharacters:
>                      ' (InputCharacter but not ')
>                     |
>                      (InputCharacter but not ') '
>                     |
>                      LineTermination
> </pre>
>
>
>
> COMPILATION:
>
> Handling of multiline strings:
>
> Text within """ delimiters is processed as follows:
>
> 1. It is split into a sequence of lines at line termination symbols.
> 2. Escape sequences in each line are processed exactly as in ordinary
> Java strings.
> 3. Leading whitespace is eliminated as follows:
>  - first, the sequence of whitespace symbols (excluding
> LineTermination, i.e. SP, HT, FF) at the start of the first nonempty
> line is determined; let's call it the 'leading whitespace sequence'.
>  - all following lines must start with the same leading whitespace
> sequence, otherwise a compile-time error is raised.
>  - whitespace processing erases this leading sequence from the
> resulting lines.
> 4. The lines that remain after erasing the leading whitespace sequence
> are concatenated, with a line-termination sequence between each two
> neighbouring lines.
>   The inserted line-termination sequence depends on
> LineTerminationSuffix, and is
>    - the value of the system property 'line.separator' if
> LineTerminationSuffix is empty
>    - LF (i.e. '\n') when LineTerminationSuffix is 'U' or 'u'
>    - CR LF (i.e. '\r''\n') when LineTerminationSuffix is 'W' or 'w'
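
The splitting, whitespace elimination and concatenation steps above can be sketched as an ordinary method, purely as an illustration and not as compiler code. Everything named here (MultilineSketch, process, leadingWhitespace) is hypothetical. Two assumptions are made that the proposal does not state explicitly: an empty line directly after the opening delimiter is dropped (the 'bar/baz/bla/qux' example suggests this), and the compile-time error is represented by an IllegalArgumentException. Escape processing (step 2) is left out, and the separator is passed in rather than derived from a suffix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MultilineSketch {
    // Returns the leading run of SP, HT and FF characters of a line.
    static String leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && " \t\f".indexOf(line.charAt(i)) >= 0) {
            i++;
        }
        return line.substring(0, i);
    }

    static String process(String body, String separator) {
        // Step 1: split at CR LF, CR or LF, keeping trailing empty lines.
        List<String> lines =
                new ArrayList<>(Arrays.asList(body.split("\r\n|\r|\n", -1)));
        // Assumption: an empty line right after the opening delimiter is dropped.
        if (lines.size() > 1 && lines.get(0).isEmpty()) {
            lines.remove(0);
        }
        // Step 3: take the leading whitespace of the first nonempty line.
        int first = 0;
        while (first < lines.size() && lines.get(first).isEmpty()) {
            first++;
        }
        String prefix = first < lines.size() ? leadingWhitespace(lines.get(first)) : "";
        List<String> stripped = new ArrayList<>();
        for (int n = 0; n < lines.size(); n++) {
            String line = lines.get(n);
            if (n < first) {
                stripped.add(line); // empty lines before the first nonempty one
            } else if (line.startsWith(prefix)) {
                stripped.add(line.substring(prefix.length()));
            } else {
                // Stands in for the compile-time error of the proposal.
                throw new IllegalArgumentException("line " + (n + 1)
                        + " does not start with the leading whitespace sequence");
            }
        }
        // Step 4: join with the chosen line-termination sequence.
        return String.join(separator, stripped);
    }
}
```

Under these assumptions, process("\n  a\n    b", "\n") yields "a\n  b", while process("\n   foo\nbar", "\n") is rejected, matching the compile-time error example. Read strictly, the rule also rejects blank lines inside an indented literal, which may or may not be intended.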
>
>
>
> Handling of raw strings:
> Text within '' delimiters is processed as follows:
> 1. It is split into a sequence of lines at line termination symbols.
> 2. The lines are concatenated, with a line-termination sequence between
> each two neighbouring lines, exactly as in the case of multiline
> strings.
>
> Neither escape processing nor leading whitespace elimination is
> performed to obtain the resulting string value.
>
> The new string literals are created and used in .class files exactly
> as ordinary strings are.
>
> TESTING:
> Nothing special. Add the new string literals to the test cases.
>
> LIBRARY SUPPORT:
> None.
>
> (It may make sense to add simple template processing to the standard
> library, but I think this is a goal for the next language iteration.
> Many good external frameworks exist now, such as Velocity; it is
> better to wait and standardize support for the winner.)
>
> REFLECTIVE APIS: None
>
> OTHER CHANGES: None
>
> MIGRATION: None
>
> COMPATIBILITY
> None
>
> REFERENCES
>
> http://bugs.sun.com/view_bug.do?bug_id=4165111
> http://bugs.sun.com/view_bug.do?bug_id=4472509
> http://docs.google.com/View?docid=d36kv8n_32g9zj7pdd by Jacek Furmankiewicz
>