String concatenation tweaks
Remi Forax
forax at univ-mlv.fr
Thu Mar 12 07:03:23 UTC 2015
Hi Louis,
On 03/11/2015 10:01 PM, Louis Wasserman wrote:
> OpenJDK's implementation of String concatenation compiles
>
> "foo" + bar + "quux" + baz
>
> into essentially the same bytecode as
>
> new StringBuilder()
> .append("foo")
> .append(bar)
> .append("quux")
> .append(baz)
> .toString()
>
> We've been successfully experimenting at Google with presizing the
> StringBuilder to avoid the need for rebuffering, with extensive
> consultation with martinrb@ and cushon@. I have not yet ported the
> patch to head, but wanted to bounce the idea off this list before
> doing so. Some key points of interest:
>
> * It suffices to provide an upper bound on the size, if that's not
> too much bigger than the real length. For example, for
> primitives, we use the bound of the maximum length of the toString
> of that primitive type: for example, a boolean is treated as
> having length bounded at 5.
> * Nonconstant Objects, including CharSequences, have their toString
> stored in a local. For example, "foo" + myStringBuilder would be
> compiled to approximately
>
> String myStringBuilderToString = myStringBuilder.toString();
> return new StringBuilder(3 + myStringBuilderToString.length())
> .append("foo")
> .append(myStringBuilderToString)
> .toString();
>
> This is necessary to deal with the possibility of mutation
> midexpression.
>
Interesting,
here you have two optimizations: one is to call toString() on each object
to append and store the result in a local variable, the other is to
pre-calculate the size of the resulting String.
Have you measured the former without combining it with the latter?
I ask because I think that the code of OptimizeStringConcat only
works if HotSpot is able to determine that all the objects to append are
Strings.
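To make the distinction concrete, here is a minimal Java sketch that separates the two optimizations so that each could be measured on its own. The class and method names are hypothetical and not taken from the patch, and the sketch deliberately ignores the case where toString() returns null.

```java
final class ConcatVariants {
    // Variant A: only hoist the toString() calls into locals.
    // The StringBuilder keeps its default initial capacity.
    static String snapshotOnly(Object left, Object right) {
        String l = String.valueOf(left);   // "null" for a null reference
        String r = String.valueOf(right);
        return new StringBuilder().append(l).append(r).toString();
    }

    // Variant B: hoist the toString() calls and also presize the builder
    // with the exact total length, so no internal buffer growth is needed.
    static String snapshotAndPresize(Object left, Object right) {
        String l = String.valueOf(left);
        String r = String.valueOf(right);
        return new StringBuilder(l.length() + r.length())
                .append(l)
                .append(r)
                .toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("bar");
        System.out.println(snapshotOnly("foo", sb));       // prints "foobar"
        System.out.println(snapshotAndPresize("foo", sb)); // prints "foobar"
    }
}
```

Both variants produce the same String; they differ only in how the builder is sized, which is the part a benchmark would have to isolate.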
> * (Nonconstant primitives are also stored in a local to preserve
> evaluation order and avoid mutation, but not converted to
> Strings. There might be some room for optimization here for
> primitive values coming from final fields or locals.)
> * Some mostly-redundant null checking is necessary to deal with the
> evil edge case where toString() returns null.
>
valueOf(valueOf(x)) is quite ugly, but I don't see how to do better :(
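For readers wondering why the call is doubled, a small sketch follows. The Evil class is a hypothetical stand-in for any object whose toString() returns null; the inner String.valueOf handles a null reference, and the outer one handles a null result from toString().

```java
final class NullSafeToString {
    // Hypothetical class whose toString() misbehaves by returning null.
    static final class Evil {
        @Override
        public String toString() {
            return null;
        }
    }

    static String safe(Object o) {
        // Inner call: "null" for a null reference, otherwise o.toString(),
        // which may itself be null.
        // Outer call: turns a null toString() result into "null".
        return String.valueOf(String.valueOf(o));
    }

    public static void main(String[] args) {
        System.out.println(safe(new Evil()).length()); // prints 4
        System.out.println(safe(null));                // prints "null"
        System.out.println(safe("abc"));               // prints "abc"
    }
}
```

With a single String.valueOf, the Evil case would yield a null String, and calling length() on it for presizing would throw a NullPointerException.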
> * Taking all the above into account, our benchmarks showed 15% CPU
> improvements and 25% fewer bytes allocated relative to the status
> quo, independent of -XX:+OptimizeStringConcat.
> * While we were at it, in the case of two arguments that are
> statically known to be Strings, our benchmarks show String.concat
> to be firmly more efficient than the StringBuilder, even in the
> presence of flags like -XX:+OptimizeStringConcat. This is
> arguably a separate optimization, but nonetheless effective; our
> benchmarks at the time suggested 40% CPU improvements and 60%
> fewer bytes allocated relative to the status quo.
>
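The two-String case can be sketched as below. The names are hypothetical, and the null handling through String.valueOf is an assumption: the mail does not say how the patch treats null operands, but something like it is needed because a + b renders null as "null" whereas a.concat(b) would throw.

```java
final class TwoStringConcat {
    // What the existing compilation scheme does for a + b.
    static String viaBuilder(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    // Alternative for two operands statically known to be Strings.
    // String.valueOf maps a null operand to "null" so the result matches
    // a + b (assumed guard, not confirmed by the mail).
    static String viaConcat(String a, String b) {
        return String.valueOf(a).concat(String.valueOf(b));
    }

    public static void main(String[] args) {
        System.out.println(viaBuilder("foo", "bar")); // prints "foobar"
        System.out.println(viaConcat("foo", "bar"));  // prints "foobar"
        System.out.println(viaConcat(null, "x"));     // prints "nullx"
    }
}
```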
> So for example, "foo" + myInt + myString + "bar" + myObj would be
> compiled to the equivalent of
>
> int myIntTmp = myInt;
> String myStringTmp = String.valueOf(myString); // defend against null
> // defend against evil toString implementations returning null
> String myObjTmp = String.valueOf(String.valueOf(myObj));
>
> return new StringBuilder(
> 17 // length of "foo" (3) + max length of myInt (11) + length of "bar" (3)
> + myStringTmp.length()
> + myObjTmp.length())
> .append("foo")
> .append(myIntTmp)
> .append(myStringTmp)
> .append("bar")
> .append(myObjTmp)
> .toString();
>
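A self-contained version of the quoted example, written as a sketch rather than as the actual javac output: the class, method, and constant names are hypothetical. It derives the bound of 11 for an int from Integer.MIN_VALUE, whose decimal form "-2147483648" is the longest any int can have.

```java
final class PresizedExample {
    // Upper bound on the length of an int's decimal form: 11 characters.
    static final int MAX_INT_LENGTH = String.valueOf(Integer.MIN_VALUE).length();

    // Hand-written equivalent of: "foo" + myInt + myString + "bar" + myObj
    static String concat(int myInt, String myString, Object myObj) {
        String myStringTmp = String.valueOf(myString);               // null -> "null"
        String myObjTmp = String.valueOf(String.valueOf(myObj));     // null toString() -> "null"
        int capacity = "foo".length() + MAX_INT_LENGTH + "bar".length() // 3 + 11 + 3 = 17
                + myStringTmp.length()
                + myObjTmp.length();
        return new StringBuilder(capacity)
                .append("foo")
                .append(myInt)
                .append(myStringTmp)
                .append("bar")
                .append(myObjTmp)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(MAX_INT_LENGTH);        // prints 11
        System.out.println(concat(7, null, null)); // prints "foo7nullbarnull"
    }
}
```

The capacity is only an upper bound: for myInt = 7 the builder over-allocates by 10 characters, which is the trade-off described in the first bullet of the proposal.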
> As far as language constraints go, the JLS is (apparently
> deliberately) vague about how string concatenation is implemented.
> "An implementation may choose to perform conversion and concatenation
> in one step to avoid creating and then discarding an intermediate
> String object. To increase the performance of repeated string
> concatenation, a Java compiler may use the StringBuffer class or a
> similar technique to reduce the number of intermediate String objects
> that are created by evaluation of an expression." We see no reason
> this approach would not qualify as a "similar technique."
>
> If these suggestions (and performance numbers) are of interest, I can
> port our patch for upstream use.
cheers,
Rémi