RFR: 8013395 StringBuffer.toString performance regression impacting embedded benchmarks

David Holmes david.holmes at oracle.com
Fri May 10 06:47:42 UTC 2013


Hi David,

On 10/05/2013 4:39 PM, David Schlosnagle wrote:
> One other quick comment, if the toStringCache is non-null during
> invocation of toString(), that seems to imply that the
> StringBuffer/StringBuilder has not been mutated since the last
> invocation of toString(), is there any reason to still use the String
> copy constructor? This would address the
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6239985 portion of
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6219959.
>
> For example, I'd envisage in AbstractStringBuilder (assuming previous
> comments of pushing toStringCache from StringBuffer to
> AbstractStringBuilder):
>
>      @Override
>      public String toString() {
>          if (toStringCache == null) {
>              toStringCache = new String(value, 0, count);
>          }
>          return toStringCache;
>      }

Right this was our first thought too, but the specification for toString 
states that a new String is created. So to go this path we'd also have 
to push through a spec change for StringBuilder/StringBuffer. Given this 
is an obscure corner case I wanted to go the most minimal route 
possible. The copy-constructor doesn't copy the array which is what 
6239985 was complaining about.

Thanks,
David

> - Dave
>
>
> On Fri, May 10, 2013 at 2:25 AM, David Schlosnagle <schlosna at gmail.com
> <mailto:schlosna at gmail.com>> wrote:
>
>     Hi David,
>
>     Would it be beneficial to push the toStringCache up to
>     AbstractStringBuilder so that StringBuilder.toString() benefits from
>     this cache as well? It seems like this would affect both
>     StringBuilder and StringBuffer for repeated calls to toString(),
>     although StringBuffer would obviously have the synchronization
>     overhead as well (assuming the locking is not elided away by HotSpot).
>
>     Thanks,
>     Dave
>
>
>     On Fri, May 10, 2013 at 2:03 AM, David Holmes
>     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>> wrote:
>
>         Short version:
>
>         Cache the value returned by toString and use it to
>         copy-construct a new String on subsequent calls to toString().
>         Clear the cache on any mutating operation.
>
>         webrev: http://cr.openjdk.java.net/~__dholmes/8013395/webrev.v2/
>         <http://cr.openjdk.java.net/~dholmes/8013395/webrev.v2/>
>
>         Testing: microbenchmark for toString performance; new regression
>         test for correctness; JPRT testset core as a sanity check
>
>         Still TBD - full SE benchmark (?)
>
>         Thanks,
>         David
>         ---------
>
>         Long version:
>
>         One of the goals for JDK8 is to provide a path from Java ME CDC
>         to Java SE (or SE Embedded). In the embedded space some pretty
>         old benchmarks still get used for doing comparisons between
>         JRE's. One of which makes heavy use of StringBuffer.toString,
>         without modifying the StringBuffer in between.
>
>         Up to Java 1.4.2 a StringBuffer and a String could share the
>         underlying char[]. This meant that toString simply needed to
>         create a new String that referenced the StringBuffer's char[]
>         with no copying of the array needed. In Java 5 the
>         String/StringBuffer implementations were completely revised:
>         StringBuilder was introduced for non-synchronized use, and a new
>         AbstractStringBuilder base class added for it and StringBuffer.
>         In that implementation toString now has to copy the
>         StringBuffer's char[]. This resulted in a significant
>         performance regression for toString() and a bug - 6219959 - was
>         opened. There is quite an elaborate evaluation in that bug
>         report but bottom line was that "real code doesn't depend on
>         this - won't fix".
>
>         At some stage ME also updated to the new Java 5 code and they
>         also noticed the problem. As a result CDC6 included a variation
>         of the caching strategy that is proposed here.
>
>         Going forward because we want people to be able to compare ME
>         and SE with their familiar benchmarks, we would like to address
>         this corner case and fix it using the caching strategy outlined.
>         As a data point an 8K StringBuffer that takes ~1ms to be
>         converted to a String initially, can process subsequent
>         toString() calls in a few microseconds. So that performance
>         issue is addressed.
>
>         However we've added a write to a field in all the mutating
>         methods which obviously adds some additional computational
>         effort - though I have no doubt it is lost in the noise for all
>         but the smallest of mutating methods. Even so this should be run
>         against regular SE benchmarks to ensure there are no performance
>         regressions there - so if anyone has a suggestion as to the best
>         benchmark to run to exercise StringBuffer (if it exists), please
>         let me know.
>
>         Thanks for reading this far :)
>
>
>



More information about the core-libs-dev mailing list