API change proposal: String concatenation boost
Rémi Forax
forax at univ-mlv.fr
Sun Sep 21 13:43:34 UTC 2008
Server Performance wrote:
> Hello, this is my first contribution to OpenJDK, so sorry if I missed a
> step... And sorry for my English :-(
> This is my proposal to be discussed:
>
> THE GOAL: Boost the overall String concatenation / append operations.
>
> BACKGROUND / HISTORY:
> • At the beginning (JDK 1.0 days) we had String.concat() and StringBuffer to
> build Strings. Both approaches initially performed poorly.
> • Starting at JDK 1.4 (I think), a share-on-copy strategy was introduced in
> StringBuffer. The performance gain was obvious, but it increased the heap
> needed and in some cases produced memory leaks when reusing a StringBuffer.
> • Starting at JDK 1.5, StringBuilder was introduced as the “unsynchronized
> version”, but the copy-on-write optimization was also undone, so it became
> an always-copy scenario. Also, the String + operator is translated to
> StringBuilder.append() by javac. This has been discussed but no better
> alternative was found (see
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6219959 )
> • This current implementation generates several System.arraycopy() calls: at
> least one per append/insert/delete (two if expanding capacity) and a final
> one in the toString() method.
>
> STUDYING THE USES:
> • If we look at the uses of StringBuilder (inside JDK code, in
> application servers and in final applications), nearly 99% of the time it is
> only used to create a String in a single-threaded context and (the most
> important fact) only using the append() and toString() methods.
> • Also, in only 5% of the instantiations does the coder set the initial
> capacity. Many times it doesn’t matter, but other times it is impossible to
> guess or calculate it. And even worse: sometimes the coder guesses wrong
> and sets too much initial capacity or too little.
>
There is no single append() method, but a lot of append overloads.
> MY PROPOSAL:
> • Create a new class java.lang.StringAppender implements Appendable
> 1. Mostly the same exposed public constructors and methods as
> StringBuilder, but the only operations are the “append()” ones (no insert,
> no delete, no replace)
> 2. Internally represented as a String array
> 3. Calls arraycopy() or creates char arrays only once, inside the toString()
> method (well, this isn’t completely true: it also copies arrays when appending
> objects/variables other than String instances or char arrays, but the most
> typical operation is appending strings!)
>
You are right that the major use case is a lot of appends
in a loop, but I don't agree that this append is always
an append(String) or an append(Object).
append(char) and append(int) are very popular too and
don't work well with your implementation.
So I don't think it is worth a new class.
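To make the concern concrete, here is a minimal sketch of an appender backed by a String array. It is not the StringAppender from the attachment: the class name, field names and growth policy are all hypothetical. It only illustrates that append(char) and append(int) have no existing String to store, so each call has to allocate one first.

```java
import java.util.Arrays;

// Hypothetical sketch of a String-array-backed appender (not the attached class).
final class StringArrayAppenderSketch {
    private String[] parts = new String[8]; // assumed initial list size
    private int count;                      // number of stored parts
    private int totalLength;                // sum of the lengths of all parts

    StringArrayAppenderSketch append(String s) {
        if (s == null) {
            s = "null"; // same convention as StringBuilder.append(String)
        }
        if (count == parts.length) {
            parts = Arrays.copyOf(parts, count * 2); // grow the reference list
        }
        parts[count++] = s; // no character copy here, only a reference is stored
        totalLength += s.length();
        return this;
    }

    // No String exists yet for a char or an int, so every call
    // allocates a temporary String before it can be stored.
    StringArrayAppenderSketch append(char c) {
        return append(String.valueOf(c));
    }

    StringArrayAppenderSketch append(int i) {
        return append(Integer.toString(i));
    }

    @Override
    public String toString() {
        char[] out = new char[totalLength];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            String s = parts[i];
            s.getChars(0, s.length(), out, pos); // copy each part into place
            pos += s.length();
        }
        // The public constructor copies the array once more; code inside
        // java.lang could presumably avoid that extra copy.
        return new String(out);
    }
}
```

With this sketch, new StringArrayAppenderSketch().append("x=").append(42).append(';') creates two short-lived Strings that a char-array-based StringBuilder would not need.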
> 4. Doesn’t need to establish an initial capacity. No more calculating it
> or guessing it. Always
> • Add a new constructor in the java.lang.String class (actually 5 new
> constructors for performance reasons, see below):
> 1. public String(String... strs)
> 2. public String(String str0, String str1)
> 3. public String(String str0, String str1, String str2)
> 4. public String(String str0, String str1, String str2, String str3)
> (NOTE: these 3 additional constructors are needed to boost appends of a
> small number of Strings, in which case the overhead of creating the array
> and then looping over it is much greater than passing 2, 3 or 4 parameters in
> the constructor invocation).
>
Instead of adding new constructors, I think implementing a new method
named join
(see http://bugs.sun.com/view_bug.do?bug_id=5015163)
is better.
"".join("hello","world"); => "helloworld"
It can be implemented in exactly the same way as your method
String.copyValuesInto(), but I think it is more useful.
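A rough sketch of how such a join could compute the total length first and then copy each part exactly once. Since a method cannot be added to String from outside java.lang, this uses a hypothetical static helper that takes the separator as its first argument; the class name and the null handling are assumptions, not part of the proposal.

```java
// Hypothetical stand-in for the proposed separator.join(parts...) method.
final class JoinSketch {
    // Assumes no element of parts is null (a null would throw NullPointerException).
    static String join(String separator, String... parts) {
        if (parts.length == 0) {
            return "";
        }
        // First pass: compute the exact size of the result.
        int length = separator.length() * (parts.length - 1);
        for (String part : parts) {
            length += part.length();
        }
        // Second pass: copy every part (and separator) once into the buffer.
        char[] out = new char[length];
        int pos = 0;
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                separator.getChars(0, separator.length(), out, pos);
                pos += separator.length();
            }
            parts[i].getChars(0, parts[i].length(), out, pos);
            pos += parts[i].length();
        }
        return new String(out);
    }
}
```

JoinSketch.join("", "hello", "world") gives "helloworld", and a non-empty separator covers the more general use case for free.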
> • Change the javac behavior: the String + operator must be translated into
> “new String(String... )” instead of “new
> StringBuilder().append().append()... ..toString()”
> • Revise other JDK source code to use StringAppender, and the rest of
> the programs all around the world. (By the way, in the Glassfish V2 source
> code I see several String.concat() invocations; seems strange to me... )
> • So the new blueprints for String concatenation should be:
> 1. For append-only, not conditional concatenations, use the new String
> constructor. Example: String result = new String(part1, part2, part3,
> part4);
> 2. For append-only, conditional or looped concatenations, use the
> StringAppender class.
> 3. For other manipulations (insert, delete, replace), use StringBuilder
> 4. For a thread-safe version, use StringBuffer
>
> THE BOOST:
> As you can see in my microbenchmark results, executed in Linux x64 and
> Windows 32 bits (-server, -client, and -XX:+AggressiveOpts versions), we can
> achieve a boost between 1% and 167% (depending on the scenario and
> architecture). Well, those values are the extremes; the typical gains are
> between 20% and 70%. I think these results are good enough to be taken into
> consideration :-)
>
> THE SOURCE CODE:
> See attachments, String.java.diff with the added code (it is clear), and
> StringAppender with the new proposed class.
>
About your code:
Please use String[] var instead of String var[]. I know this is legal, even
int f() [] { return new int[]{3}; }
is legal, but it's not the Java way.
StringAppender.expandListCapacity should use Arrays.copyOf().
StringAppender.size() should not be public; it's error-prone and
not very useful.
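The two style points could look roughly like this. The grow method below is only a guess at what an expandListCapacity-like helper does (the attachment is not reproduced here), so its name, parameters and doubling policy are assumptions.

```java
import java.util.Arrays;

final class ArrayStyleSketch {
    // Preferred style: the brackets belong to the type (String[] list, not String list[]).
    // Arrays.copyOf allocates the larger array and copies the old elements in one call,
    // replacing a manual "new String[n]" followed by System.arraycopy().
    static String[] grow(String[] list, int minCapacity) {
        int newCapacity = Math.max(minCapacity, list.length * 2); // assumed growth policy
        return Arrays.copyOf(list, newCapacity);
    }

    // Legal but discouraged: array brackets placed after the parameter list.
    static int f() [] { return new int[] {3}; }
}
```

New slots in the array returned by Arrays.copyOf are filled with null, so the caller still has to track how many entries are actually used.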
> THE MICROBENCHMARK CODE:
> See attachment.
> Of course it should be reviewed. I think I have done it correctly.
>
> THE MICROBENCHMARK RESULTS (they varied for me by about +/-1% between
> executions due to the host load or whatever):
> See attached file. I think they are great...
>
>
> What do you think?
> Best regards,
> --Jesús Viñuales
>
cheers,
Rémi Forax