Raw String Literals

Jim Laskey james.laskey at oracle.com
Thu Mar 22 19:23:29 UTC 2018


Like all things, this is a tradeoff, and not necessarily one that admits of simple "fixes" like "just ban double-ticks." Let’s break it down.

1. The assumption that developers are unable to see double-backtick as anything other than an empty string is not valid. (Consider that single-quotes are used for chars in Java; no one thinks '' is the "empty char".) This is something that is learned. And there already is a way to express “empty string” in Java — "". Developers can learn the basic idea that raw string literals should be used when ordinary string literals fail them -- multiple lines, or undesired interpretation of special characters. Think double quotes first.

	varargs(`Hello`, ``, ` World! `, ``); ==>  varargs("Hello", "", " World! ", "");
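For comparison, here is a minimal runnable sketch of the double-quote form, using the varargs method from the quoted puzzler. Returning the joined string instead of printing it, and the class name VarargsDemo, are assumptions made for illustration.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class VarargsDemo {
    // Same body as the puzzler's method, but returns the result (an assumption)
    // so that it can be inspected as well as printed.
    static String varargs(String... strs) {
        return Stream.of(strs).collect(Collectors.joining());
    }

    public static void main(String[] args) {
        // Four arguments: two of them are ordinary empty string literals.
        System.out.println(varargs("Hello", "", " World! ", ""));
    }
}
```

With ordinary literals the call really does pass four arguments, and the two empty strings contribute nothing to the joined result.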

2. We were well aware of Markdown’s treatment of ticks; this did indeed influence the design. Markdown does, in fact, support double-backticks, but that gets blurred when the discussion conflates code spans with code fences. I myself use double-backticks in Markdown frequently, because that’s how I write up documentation on raw string literals.

3. This "puzzler" isn’t really a puzzler; it’s just a speed bump on the way to learning the feature. Once you know the rules, it’s not confusing at all. IDEs will also help: syntax coloring will make it obvious that this line doesn’t mean "pass an empty string". It should also be possible for IDEs to choose the optimal/correct quoting for a given string body.

	String empty = ``;

IDE: Did you mean?

	String empty = "";
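A rough sketch of how a tool might choose the quoting for a given string body, assuming the rules of the current proposal: a raw body cannot be empty, cannot start or end with a backtick, and the delimiter must be a run of backticks whose length does not occur in the body. QuoteChooser, suggest, rawQuote and delimiterLength are hypothetical names, not part of any IDE or JDK API.

```java
import java.util.HashSet;
import java.util.Set;

class QuoteChooser {
    // Smallest n >= 1 such that the body contains no run of exactly n backticks.
    static int delimiterLength(String body) {
        Set<Integer> runs = new HashSet<>();
        int i = 0;
        while (i < body.length()) {
            if (body.charAt(i) == '`') {
                int j = i;
                while (j < body.length() && body.charAt(j) == '`') j++;
                runs.add(j - i);
                i = j;
            } else {
                i++;
            }
        }
        int n = 1;
        while (runs.contains(n)) n++;
        return n;
    }

    // Wraps the body in the shortest usable backtick delimiter.
    // Throws if the body cannot be expressed as a raw string literal at all.
    static String rawQuote(String body) {
        if (body.isEmpty() || body.charAt(0) == '`' || body.charAt(body.length() - 1) == '`') {
            throw new IllegalArgumentException("body cannot be written as a raw string literal");
        }
        String ticks = "`".repeat(delimiterLength(body));
        return ticks + body + ticks;
    }

    // "Think double quotes first": only fall back to a raw literal when the body
    // contains something an ordinary literal would need to escape.
    static String suggest(String body) {
        boolean plainIsFine = body.indexOf('\\') < 0 && body.indexOf('"') < 0
                && body.indexOf('\n') < 0 && body.indexOf('\r') < 0;
        return plainIsFine ? "\"" + body + "\"" : rawQuote(body);
    }

    public static void main(String[] args) {
        System.out.println(suggest(""));               // prints ""
        System.out.println(suggest("C:\\temp\\a`b"));  // prints ``C:\temp\a`b``
    }
}
```

The empty body never reaches the raw path, which is the "Did you mean?" case: the only sensible suggestion is the ordinary literal.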

4. Let’s look at the proposed alternate solution, which is to disallow double-backticks. This may look like an obvious win, because it rescues those who assume that double-backtick is the empty string. But it also creates a cost, which is borne by all users: an anomaly. Learning the rule "any number of ticks" is easier than "any number of ticks, except the second most convenient number (two) you’d probably want to use."

5. Any solution will introduce puzzlers, for example: varargs(`Hello`, ````, ` World! `, ````); Using a "fewest puzzlers" criterion will not lead to a "better" solution.
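To make the "any number of ticks" rule concrete, here is a minimal sketch of a scanner, assuming the rule as proposed: a literal opens with a run of n backticks and closes at the next run of exactly n backticks, with any other run treated as part of the body. RawLiteralScanner and its methods are hypothetical names for illustration, not compiler code.

```java
import java.util.ArrayList;
import java.util.List;

class RawLiteralScanner {
    // Length of the run of backticks starting at index i.
    static int tickRun(String s, int i) {
        int j = i;
        while (j < s.length() && s.charAt(j) == '`') j++;
        return j - i;
    }

    // Returns the bodies of the raw string literals found in src, in order.
    static List<String> scan(String src) {
        List<String> bodies = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            if (src.charAt(i) != '`') {
                i++;
                continue;
            }
            int open = tickRun(src, i);   // opening delimiter length
            int bodyStart = i + open;
            int close = -1;
            int j = bodyStart;
            while (j < src.length()) {
                if (src.charAt(j) == '`') {
                    int run = tickRun(src, j);
                    if (run == open) {    // only an equal-length run closes the literal
                        close = j;
                        break;
                    }
                    j += run;             // a shorter or longer run is body text
                } else {
                    j++;
                }
            }
            if (close < 0) {
                throw new IllegalArgumentException("unterminated raw string literal at index " + i);
            }
            bodies.add(src.substring(bodyStart, close));
            i = close + open;
        }
        return bodies;
    }

    public static void main(String[] args) {
        for (String body : scan("varargs(`Hello`, ``, ` World! `, ``);")) {
            System.out.println("[" + body + "]");
        }
    }
}
```

Run on the two-tick and the four-tick versions of the call, this sketch finds the same two literals in each case, which is why moving the anomaly to a different tick count does not remove the puzzler.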

Cheers,

— Jim


> On Mar 22, 2018, at 5:32 AM, Stephen Colebourne <scolebourne at joda.org> wrote:
> 
> On 21 March 2018 at 15:10, Jim Laskey <james.laskey at oracle.com> wrote:
>> We think things have "settled in" with respect to Raw String Literals language changes and library support. If all things fall in place, we will probably move http://openjdk.java.net/jeps/326 forward soon.
> 
> What does this print?
> 
> public void varargs(String... strs) {
>  System.out.println(Stream.of(strs).collect(joining()));
> }
> 
> varargs(`Hello`, ``, ` World! `, ``);
> 
> .
> 
> 
> scroll down for the answer...
> 
> 
> .
> 
> 
> .
> 
> 
> Answer: "Hello, ` World! `, " (notice the extra commas, because there
> are only 2 arguments, not 4, due to no empty raw string literals)
> (thanks to Kevin B for the puzzler)
> 
> As I have stated elsewhere, I believe the trade offs in the raw string
> literal proposal are poor for Java. Not allowing an empty string and
> not allowing strings to start or end with backticks is surprising, and
> I believe the puzzler demonstrates the nasty effects of excluding an
> empty string literal (as also demonstrated here
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-March/052052.html).
> I do understand the appeal of being able to self-embed (the "stress
> test"), I just happen to think it is the wrong trade off.
> 
> 
> Fortunately, I believe there is a way for Java to have both empty raw
> string literals and full embedding. Simply disallow double backtick as
> a delimiter.
> 
> `foo` = "foo" (single backtick)
> `` = "" (empty string)
> ``foo`` = compile error
> ```foo``` = "foo" (triple backtick)
> `````` = compile error (no empty string here)
> `````foo````` = "foo" (five backticks)
> 
> There is a major specification that separates one backtick from three
> backticks - Markdown:
> http://spec.commonmark.org/0.27/#code-spans
> http://spec.commonmark.org/0.27/#fenced-code-blocks
> As such, I think this would be readily accepted by developers, who are
> often familiar with Markdown.
> 
> Were this approach to be adopted, I believe single backtick literals
> should not allow new lines. This would be a good simplification, and
> probably help IDE error recovery in many cases. The 3+ backtick
> literals would behave as per the current Oracle proposal, but with the
> benefit that when embedding just one or two backticks developers will
> not need to see or learn the rule about unlimited backticks. Things
> like regex would tend to use single backticks, while things like
> embedding XML or Javascript would use 3+ backticks.
> 
> In summary, not allowing an empty raw string is going to result in
> nasty puzzlers, and IMO unlimited delimiters by themselves is not in
> the spirit of Java. Following Markdown's approach of separation
> between single and 3+ backticks would provide considerable benefits to
> Java.
> 
> Stephen


