RFR: 8013395 StringBuffer.toString performance regression impacting embedded benchmarks
David Holmes
david.holmes at oracle.com
Fri May 10 06:47:42 UTC 2013
Hi David,
On 10/05/2013 4:39 PM, David Schlosnagle wrote:
> One other quick comment, if the toStringCache is non-null during
> invocation of toString(), that seems to imply that the
> StringBuffer/StringBuilder has not been mutated since the last
> invocation of toString(), is there any reason to still use the String
> copy constructor? This would address the
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6239985 portion of
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6219959.
>
> For example, I'd envisage in AbstractStringBuilder (assuming previous
> comments of pushing toStringCache from StringBuffer to
> AbstractStringBuilder):
>
> @Override
> public String toString() {
> if (toStringCache == null) {
> toStringCache = new String(value, 0, count);
> }
> return toStringCache;
> }
Right this was our first thought too, but the specification for toString
states that a new String is created. So to go this path we'd also have
to push through a spec change for StringBuilder/StringBuffer. Given this
is an obscure corner case I wanted to go the most minimal route
possible. The copy-constructor doesn't copy the array which is what
6239985 was complaining about.
Thanks,
David
> - Dave
>
>
> On Fri, May 10, 2013 at 2:25 AM, David Schlosnagle <schlosna at gmail.com
> <mailto:schlosna at gmail.com>> wrote:
>
> Hi David,
>
> Would it be beneficial to push the toStringCache up to
> AbstractStringBuilder so that StringBuilder.toString() benefits from
> this cache as well? It seems like this would affect both
> StringBuilder and StringBuffer for repeated calls to toString(),
> although StringBuffer would obviously have the synchronization
> overhead as well (assuming the locking is not elided away by HotSpot).
>
> Thanks,
> Dave
>
>
> On Fri, May 10, 2013 at 2:03 AM, David Holmes
> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>> wrote:
>
> Short version:
>
> Cache the value returned by toString and use it to
> copy-construct a new String on subsequent calls to toString().
> Clear the cache on any mutating operation.
>
> webrev: http://cr.openjdk.java.net/~__dholmes/8013395/webrev.v2/
> <http://cr.openjdk.java.net/~dholmes/8013395/webrev.v2/>
>
> Testing: microbenchmark for toString performance; new regression
> test for correctness; JPRT testset core as a sanity check
>
> Still TBD - full SE benchmark (?)
>
> Thanks,
> David
> ---------
>
> Long version:
>
> One of the goals for JDK8 is to provide a path from Java ME CDC
> to Java SE (or SE Embedded). In the embedded space some pretty
> old benchmarks still get used for doing comparisons between
> JRE's. One of which makes heavy use of StringBuffer.toString,
> without modifying the StringBuffer in between.
>
> Up to Java 1.4.2 a StringBuffer and a String could share the
> underlying char[]. This meant that toString simply needed to
> create a new String that referenced the StringBuffer's char[]
> with no copying of the array needed. In Java 5 the
> String/StringBuffer implementations were completely revised:
> StringBuilder was introduced for non-synchronized use, and a new
> AbstractStringBuilder base class added for it and StringBuffer.
> In that implementation toString now has to copy the
> StringBuffer's char[]. This resulted in a significant
> performance regression for toString() and a bug - 6219959 - was
> opened. There is quite an elaborate evaluation in that bug
> report but bottom line was that "real code doesn't depend on
> this - won't fix".
>
> At some stage ME also updated to the new Java 5 code and they
> also noticed the problem. As a result CDC6 included a variation
> of the caching strategy that is proposed here.
>
> Going forward because we want people to be able to compare ME
> and SE with their familiar benchmarks, we would like to address
> this corner case and fix it using the caching strategy outlined.
> As a data point an 8K StringBuffer that takes ~1ms to be
> converted to a String initially, can process subsequent
> toString() calls in a few microseconds. So that performance
> issue is addressed.
>
> However we've added a write to a field in all the mutating
> methods which obviously adds some additional computational
> effort - though I have no doubt it is lost in the noise for all
> but the smallest of mutating methods. Even so this should be run
> against regular SE benchmarks to ensure there are no performance
> regressions there - so if anyone has a suggestion as to the best
> benchmark to run to exercise StringBuffer (if it exists), please
> let me know.
>
> Thanks for reading this far :)
>
>
>
More information about the core-libs-dev
mailing list