String concatenation tweaks
Alex Buckley
alex.buckley at oracle.com
Thu Mar 12 01:13:51 UTC 2015
On 3/11/2015 2:01 PM, Louis Wasserman wrote:
> So for example, "foo" + myInt + myString + "bar" + myObj would be
> compiled to the equivalent of
>
> int myIntTmp = myInt;
> String myStringTmp = String.valueOf(myString); // defend against null
> // defend against evil toString implementations returning null
> String myObjTmp = String.valueOf(String.valueOf(myObj));
>
> return new StringBuilder(
>     // length of "foo" (3) + max length of myInt (11) + length of "bar" (3)
>     17
>     + myStringTmp.length()
>     + myObjTmp.length())
>   .append("foo")
>   .append(myIntTmp)
>   .append(myStringTmp)
>   .append("bar")
>   .append(myObjTmp)
>   .toString();
>
> As far as language constraints go, the JLS is (apparently deliberately)
> vague about how string concatenation is implemented. "An implementation
> may choose to perform conversion and concatenation in one step to avoid
> creating and then discarding an intermediate String object. To increase
> the performance of repeated string concatenation, a Java compiler may
> use the StringBuffer class or a similar technique to reduce the number
> of intermediate String objects that are created by evaluation of an
> expression." We see no reason this approach would not qualify as a
> "similar technique."
The really key property of the string concatenation operator is
left-associativity. Later subexpressions must not be evaluated until
earlier subexpressions have been successfully evaluated AND
concatenated. Consider this expression:
"foo" + m() + n()
which JLS8 15.18 specifies to mean:
("foo" + m()) + n()
We know from JLS8 15.6 that if m() throws, then "foo" + m() throws, and
n() will never be evaluated.
Happily, your translation doesn't appear to catch and swallow exceptions
when eagerly evaluating each subexpression in turn, so I believe you
won't evaluate n() if m() already threw.
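A minimal sketch of that guarantee, for illustration only. The class name, the bodies of m() and n(), and the `evaluated` list are hypothetical and not part of the proposal; the sketch only records which operands run when m() completes abruptly.

```java
import java.util.ArrayList;
import java.util.List;

class AbruptCompletionDemo {
    // Records which operands were actually evaluated (illustrative helper).
    static final List<String> evaluated = new ArrayList<>();

    // Hypothetical operand that always completes abruptly.
    static String m() {
        evaluated.add("m");
        throw new IllegalStateException("m() failed");
    }

    // Hypothetical operand that would complete normally if it were reached.
    static String n() {
        evaluated.add("n");
        return "n";
    }

    // Evaluates "foo" + m() + n() and reports which operands ran.
    static List<String> run() {
        evaluated.clear();
        try {
            String s = "foo" + m() + n();
            evaluated.add("result:" + s);
        } catch (IllegalStateException e) {
            evaluated.add("caught");
        }
        return new ArrayList<>(evaluated);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [m, caught]
    }
}
```

n() never appears in the list because the abrupt completion of m() immediately completes the enclosing expression abruptly.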
Unhappily, a call to append(..) can in general fail with
OutOfMemoryError. (I'm not talking about asynchronous exceptions in
general, but rather the sense that append(..) manipulates the heap so an
OOME is at least plausible.) In the OpenJDK implementation, if
blah.append(m()) fails with OOME, then n() hasn't been evaluated yet --
that's "real" left-associativity. In the proposed implementation, it's
possible that more memory is available when evaluating m() and n()
upfront than at the time of an append call, so n() is evaluated even if
append(<<tmp result of m()>>) fails -- that's not left-associative.
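The difference in observable order can be sketched as follows. This is an illustration under loud assumptions: `FailingSink` is a hypothetical stand-in for StringBuilder whose second append throws a simulated OutOfMemoryError (a real OOME is not reproducible on demand), and m(), n(), and `log` are likewise invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class EvaluationOrderDemo {
    // Hypothetical stand-in for StringBuilder: the second append fails,
    // simulating an OutOfMemoryError without exhausting the heap.
    static final class FailingSink {
        private final StringBuilder sb = new StringBuilder();
        private int appends = 0;

        FailingSink append(String s) {
            if (++appends == 2) {
                throw new OutOfMemoryError("simulated");
            }
            sb.append(s);
            return this;
        }

        @Override
        public String toString() {
            return sb.toString();
        }
    }

    // Records the order in which operands are evaluated.
    static final List<String> log = new ArrayList<>();

    static String m() { log.add("m"); return "M"; }
    static String n() { log.add("n"); return "N"; }

    // Incremental style: each operand is evaluated just before its append.
    static List<String> incremental() {
        log.clear();
        try {
            new FailingSink().append("foo").append(m()).append(n()).toString();
        } catch (OutOfMemoryError e) {
            log.add("OOME");
        }
        return new ArrayList<>(log);
    }

    // Eager style: all operands are evaluated up front, then appended.
    static List<String> eager() {
        log.clear();
        try {
            String mTmp = m();
            String nTmp = n();
            new FailingSink().append("foo").append(mTmp).append(nTmp).toString();
        } catch (OutOfMemoryError e) {
            log.add("OOME");
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(incremental()); // prints [m, OOME]
        System.out.println(eager());       // prints [m, n, OOME]
    }
}
```

In the incremental version the failure of the append of m()'s result prevents n() from running; in the eager version n() has already run by the time that append fails.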
Perhaps you can set my mind at ease that append(..) can't fail with OOME?
Alex