String concatenation tweaks

Alex Buckley alex.buckley at oracle.com
Thu Mar 12 23:28:20 UTC 2015


Here is a more abstract presentation. Given the expression:

   "foo" + m() + n()

you must not evaluate n() if evaluation of "foo" + m() completes 
abruptly. The proposed implementation evaluates n() regardless.
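
As a rough sketch of the observable difference (not taken from either implementation): FailingBuilder below is a hypothetical stand-in for StringBuilder whose second append throws, simulating an OOME with an ordinary exception so the effect is reproducible. The names ConcatOrderDemo, interleaved and eager are likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ConcatOrderDemo {
    // Records which operand methods were actually evaluated.
    static final List<String> trace = new ArrayList<>();

    static String m() { trace.add("m"); return "M"; }
    static String n() { trace.add("n"); return "N"; }

    // Hypothetical stand-in for StringBuilder: the failOn-th append throws.
    static final class FailingBuilder {
        private final StringBuilder sb = new StringBuilder();
        private final int failOn;
        private int calls = 0;

        FailingBuilder(int failOn) { this.failOn = failOn; }

        FailingBuilder append(String s) {
            if (++calls == failOn) {
                throw new IllegalStateException("simulated append failure");
            }
            sb.append(s);
            return this;
        }

        @Override public String toString() { return sb.toString(); }
    }

    // Each operand is evaluated immediately before it is appended.
    static String interleaved(FailingBuilder b) {
        return b.append("foo").append(m()).append(n()).toString();
    }

    // All operands are evaluated up front, then appended.
    static String eager(FailingBuilder b) {
        String mTmp = m();
        String nTmp = n();
        return b.append("foo").append(mTmp).append(nTmp).toString();
    }

    // Runs one strategy with a builder whose second append fails,
    // and returns the operands that were evaluated.
    static List<String> run(boolean eagerMode) {
        trace.clear();
        try {
            FailingBuilder b = new FailingBuilder(2);
            if (eagerMode) { eager(b); } else { interleaved(b); }
        } catch (IllegalStateException expected) {
            // The simulated failure is the point of the demo.
        }
        return new ArrayList<>(trace);
    }

    public static void main(String[] args) {
        System.out.println("interleaved: " + run(false)); // prints interleaved: [m]
        System.out.println("eager:       " + run(true));  // prints eager:       [m, n]
    }
}
```

When the append of m()'s result fails, the interleaved form never calls n(), while the eager form has already called it.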

All is not lost. In the proposed implementation, the abrupt completion 
of "foo" + m() could occur because an append call fails or (thanks to 
Jon for pointing this out) the StringBuilder ctor fails. The 
quality-of-implementation issue is thus: if the proposed implementation 
is of sufficiently high quality to guarantee that the ctor and the first 
append both succeed, then the evaluation of "foo" + m() will always 
complete normally, and it would be an unobservable (thus acceptable) 
implementation detail to evaluate n() early.
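
To make the "sufficiently high quality" condition a little more concrete, here is a minimal sketch, assuming the builder is constructed with room for every operand. PresizedAppendDemo and its two helpers are hypothetical names. The sketch only checks that capacity() does not change across the appends; it is not a proof that append cannot throw OutOfMemoryError, since the constructor itself allocates and an implementation may allocate internally for other reasons.

```java
class PresizedAppendDemo {
    // Returns {capacity before the appends, capacity after the appends}
    // for a builder constructed with exactly the capacity it will need.
    static int[] presized(String a, String b, String c) {
        int needed = a.length() + b.length() + c.length();
        StringBuilder sb = new StringBuilder(needed); // the up-front allocation
        int before = sb.capacity();
        sb.append(a).append(b).append(c);
        return new int[] { before, sb.capacity() };
    }

    // Same measurement for a builder with the default capacity (16 chars),
    // which has to grow once the content exceeds that.
    static int[] defaultSized(String a, String b, String c) {
        StringBuilder sb = new StringBuilder();
        int before = sb.capacity();
        sb.append(a).append(b).append(c);
        return new int[] { before, sb.capacity() };
    }

    public static void main(String[] args) {
        int[] p = presized("foo", "0123456789", "abcdefghijklmnopq");
        int[] d = defaultSized("foo", "0123456789", "abcdefghijklmnopq");
        System.out.println("presized: " + p[0] + " -> " + p[1]);
        System.out.println("default:  " + d[0] + " -> " + d[1]);
    }
}
```

With 30 characters of content, the presized builder reports the same capacity before and after, while the default-sized one has had to grow.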

Alex

On 3/11/2015 10:26 PM, Jeremy Manson wrote:
> Isn't Louis's proposed behavior equivalent to saying "the rightmost
> concatenation threw an OOME" instead of "some concatenation in the
> middle threw an OOME"?
>
> It's true that the intermediate String concatenations haven't occurred
> at that point, but in the JDK's current implementation, that's true,
> too: the concatenations that have occurred at that point are
> StringBuilder ones, not String ones.  If any of the append operations
> throws an OOME, no Strings have been created at all, either in Louis's
> implementation or in the JDK's.
>
> Ultimately, isn't this a quality of implementation issue?  And if so,
> isn't it a quality of implementation issue that doesn't provide any
> additional quality?  I can't imagine code whose semantics relies on
> this, and if they do, they are relying on something
> implementation-dependent.
>
> Jeremy
>
> On Wed, Mar 11, 2015 at 6:13 PM, Alex Buckley
> <alex.buckley at oracle.com> wrote:
>
>     On 3/11/2015 2:01 PM, Louis Wasserman wrote:
>
>         So for example, "foo" + myInt + myString + "bar" + myObj would be
>         compiled to the equivalent of
>
>         int myIntTmp = myInt;
>         // defend against null
>         String myStringTmp = String.valueOf(myString);
>         // defend against evil toString implementations returning null
>         String myObjTmp = String.valueOf(String.valueOf(myObj));
>
>         return new StringBuilder(
>                // length of "foo" (3) + max length of myInt (11)
>                // + length of "bar" (3)
>                17
>                + myStringTmp.length()
>                + myObjTmp.length())
>              .append("foo")
>              .append(myIntTmp)
>              .append(myStringTmp)
>              .append("bar")
>              .append(myObjTmp)
>              .toString();
>
>         As far as language constraints go, the JLS is (apparently
>         deliberately) vague about how string concatenation is
>         implemented.  "An implementation may choose to perform
>         conversion and concatenation in one step to avoid creating and
>         then discarding an intermediate String object. To increase the
>         performance of repeated string concatenation, a Java compiler
>         may use the StringBuffer class or a similar technique to reduce
>         the number of intermediate String objects that are created by
>         evaluation of an expression."  We see no reason this approach
>         would not qualify as a "similar technique."
>
>
>     The really key property of the string concatenation operator is
>     left-associativity. Later subexpressions must not be evaluated until
>     earlier subexpressions have been successfully evaluated AND
>     concatenated. Consider this expression:
>
>        "foo" + m() + n()
>
>     which JLS8 15.18 specifies to mean:
>
>        ("foo" + m()) + n()
>
>     We know from JLS8 15.6 that if m() throws, then "foo" + m() throws,
>     and n() will never be evaluated.
>
>     Happily, your translation doesn't appear to catch and swallow
>     exceptions when eagerly evaluating each subexpression in turn, so I
>     believe you won't evaluate n() if m() already threw.
>
>     Unhappily, a call to append(..) can in general fail with
>     OutOfMemoryError. (I'm not talking about asynchronous exceptions in
>     general, but rather the sense that append(..) manipulates the heap
>     so an OOME is at least plausible.) In the OpenJDK implementation, if
>     blah.append(m()) fails with OOME, then n() hasn't been evaluated yet
>     -- that's "real" left-associativity. In the proposed implementation,
>     it's possible that more memory is available when evaluating m() and
>     n() upfront than at the time of an append call, so n() is evaluated
>     even if append(<<tmp result of m()>>) fails -- that's not
>     left-associative.
>
>     Perhaps you can set my mind at ease that append(..) can't fail with
>     OOME?
>
>     Alex
>
>

