String concatenation tweaks
Alex Buckley
alex.buckley at oracle.com
Thu Mar 12 23:28:20 UTC 2015
Here is a more abstract presentation. Given the expression:
"foo" + m() + n()
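you must not evaluate n() if evaluation of "foo" + m() completes
abruptly. The proposed implementation evaluates n() before performing
any concatenation, so n() runs even when the concatenation of "foo" and
m() would go on to fail.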
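
To make the required behaviour concrete, here is a minimal sketch. The class name and the trace list are hypothetical, and m() is simply made to throw so that "foo" + m() completes abruptly; it is an illustration, not part of any proposal in this thread:

```java
import java.util.ArrayList;
import java.util.List;

public class LeftToRightDemo {
    // Records which operands were evaluated, in order.
    static final List<String> trace = new ArrayList<>();

    static String m() {
        trace.add("m");
        throw new IllegalStateException("m() failed");
    }

    static String n() {
        trace.add("n");
        return "n";
    }

    // Evaluates "foo" + m() + n() and returns a copy of the trace.
    static List<String> run() {
        trace.clear();
        try {
            String s = "foo" + m() + n();
            trace.add("result=" + s);
        } catch (IllegalStateException e) {
            trace.add("caught");
        }
        return new ArrayList<>(trace);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [m, caught]
    }
}
```

Since m() throws, n() never appears in the trace. The open question in this thread is the harder case, where m() returns normally and the concatenation step itself fails.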
All is not lost. In the proposed implementation, the abrupt completion
of "foo" + m() could occur because an append call fails or (thanks to
Jon for pointing this out) the StringBuilder ctor fails. The
quality-of-implementation issue is thus: if the proposed implementation
is of sufficiently high quality to guarantee that the ctor and the first
append both succeed, then the evaluation of "foo" + m() will always
complete normally, and it would be an unobservable (thus acceptable)
implementation detail to evaluate n() early.
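
The difference only becomes observable when an append fails, and that can be sketched roughly as follows. FlakyBuilder is a hypothetical stand-in for StringBuilder that throws a simulated OutOfMemoryError on a chosen append; the "interleaved" and "eager" methods are my own simplified models of the two translations, not the actual javac output or Louis's patch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class EvaluationOrderSketch {
    // Hypothetical stand-in for StringBuilder: the chosen append call fails,
    // simulating an OutOfMemoryError without exhausting the heap.
    static final class FlakyBuilder {
        private final StringBuilder sb = new StringBuilder();
        private final int failOnAppend; // 1-based index of the failing append
        private int appends = 0;

        FlakyBuilder(int failOnAppend) { this.failOnAppend = failOnAppend; }

        FlakyBuilder append(String s) {
            if (++appends == failOnAppend) {
                throw new OutOfMemoryError("simulated");
            }
            sb.append(s);
            return this;
        }

        @Override public String toString() { return sb.toString(); }
    }

    // Interleaved: each operand is evaluated just before it is appended.
    static String interleaved(FlakyBuilder b, Supplier<String> m, Supplier<String> n) {
        b.append("foo");
        b.append(m.get());
        b.append(n.get());
        return b.toString();
    }

    // Eager: all operands are evaluated first, then appended.
    static String eager(FlakyBuilder b, Supplier<String> m, Supplier<String> n) {
        String mTmp = m.get();
        String nTmp = n.get();
        return b.append("foo").append(mTmp).append(nTmp).toString();
    }

    // Runs one strategy with the second append failing; returns what was evaluated.
    static List<String> trace(boolean eagerStrategy) {
        List<String> log = new ArrayList<>();
        Supplier<String> m = () -> { log.add("m"); return "M"; };
        Supplier<String> n = () -> { log.add("n"); return "N"; };
        FlakyBuilder b = new FlakyBuilder(2); // the append of m()'s result fails
        try {
            if (eagerStrategy) eager(b, m, n); else interleaved(b, m, n);
        } catch (OutOfMemoryError e) {
            log.add("OOME");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println("interleaved: " + trace(false)); // [m, OOME]
        System.out.println("eager:       " + trace(true));  // [m, n, OOME]
    }
}
```

In the eager model n() shows up in the trace even though the append of m()'s result failed. If that append could be guaranteed never to fail, the two traces could not be told apart.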
Alex
On 3/11/2015 10:26 PM, Jeremy Manson wrote:
> Isn't Louis's proposed behavior equivalent to saying "the rightmost
> concatenation threw an OOME" instead of "some concatenation in the
> middle threw an OOME"?
>
> It's true that the intermediate String concatenations haven't occurred
> at that point, but in the JDK's current implementation, that's true,
> too: the concatenations that have occurred at that point are
> StringBuilder ones, not String ones. If any of the append operations
> throws an OOME, no Strings have been created at all, either in Louis's
> implementation or in the JDK's.
>
> Ultimately, isn't this a quality of implementation issue? And if so,
> isn't it a quality of implementation issue that doesn't provide any
> additional quality? I can't imagine code whose semantics rely on
> this, and if any code does, it is relying on something
> implementation-dependent.
>
> Jeremy
>
> On Wed, Mar 11, 2015 at 6:13 PM, Alex Buckley <alex.buckley at oracle.com> wrote:
>
> On 3/11/2015 2:01 PM, Louis Wasserman wrote:
>
> So for example, "foo" + myInt + myString + "bar" + myObj would be
> compiled to the equivalent of
>
> int myIntTmp = myInt;
> // defend against null
> String myStringTmp = String.valueOf(myString);
> // defend against evil toString implementations returning null
> String myObjTmp = String.valueOf(String.valueOf(myObj));
>
> return new StringBuilder(
>         17 // length of "foo" (3) + max length of myInt (11) + length of "bar" (3)
>         + myStringTmp.length()
>         + myObjTmp.length())
>     .append("foo")
>     .append(myIntTmp)
>     .append(myStringTmp)
>     .append("bar")
>     .append(myObjTmp)
>     .toString();
>
> As far as language constraints go, the JLS is (apparently deliberately)
> vague about how string concatenation is implemented. "An implementation
> may choose to perform conversion and concatenation in one step to avoid
> creating and then discarding an intermediate String object. To increase
> the performance of repeated string concatenation, a Java compiler may
> use the StringBuffer class or a similar technique to reduce the number
> of intermediate String objects that are created by evaluation of an
> expression." We see no reason this approach would not qualify as a
> "similar technique."
>
>
> The really key property of the string concatenation operator is
> left-associativity. Later subexpressions must not be evaluated until
> earlier subexpressions have been successfully evaluated AND
> concatenated. Consider this expression:
>
> "foo" + m() + n()
>
> which JLS8 15.18 specifies to mean:
>
> ("foo" + m()) + n()
>
> We know from JLS8 15.6 that if m() throws, then "foo" + m() throws, and
> n() will never be evaluated.
>
> Happily, your translation doesn't appear to catch and swallow
> exceptions when eagerly evaluating each subexpression in turn, so I
> believe you won't evaluate n() if m() already threw.
>
> Unhappily, a call to append(..) can in general fail with
> OutOfMemoryError. (I'm not talking about asynchronous exceptions in
> general, but rather the sense that append(..) manipulates the heap
> so an OOME is at least plausible.) In the OpenJDK implementation, if
> blah.append(m()) fails with OOME, then n() hasn't been evaluated yet
> -- that's "real" left-associativity. In the proposed implementation,
> it's possible that more memory is available when evaluating m() and
> n() upfront than at the time of an append call, so n() is evaluated
> even if append(<<tmp result of m()>>) fails -- that's not
> left-associative.
>
> Perhaps you can set my mind at ease that append(..) can't fail with
> OOME?
>
> Alex
>
>