RFR: 8013395 StringBuffer.toString performance regression impacting embedded benchmarks

Tue May 14 05:10:36 UTC 2013

On 14/05/2013 5:05 AM, Alan Bateman wrote:
> On 13/05/2013 08:12, David Holmes wrote:
>> Thanks for all the feedback and discussion. I've rewritten the patch
>> to use Peter's suggestion of caching the char[] instead of the actual
>> String:
>>
>> http://cr.openjdk.java.net/~dholmes/8013395/webrev.v3/
>>
>> I too am concerned that any form of caching that avoids the array-copy
>> (which is the crux of the issue) is going to incur a 2x space penalty.
>> I can imagine old but well-behaved code that uses StringBuffer rather
>> than StringBuilder where very long buffers are created and toString()
>> only called once - and they may get immediate or subsequent OOME due
>> to the extra char[]; or excessive GC activity might be induced if a
>> lot of StringBuffers are used.
> Assuming the well-behaved case is where the StringBuffer is discarded
> after calling toString then I wouldn't expect it to too different to
> today. That is, you have the 2x penalty when toString is called now.

Thank you! I had been comparing the cached case with the ancient sharing 
case - where there was no space penalty. As you say in the existing code 
we already use 2x the space at the time toString is called, so no change 
in that regard. Doh!

> Clearly there are other usages which could be problematic.

Yes but we think those rare enough to not be a concern.

> I'm not sure what to say about the copy-on-change-after-toString
> approach. As the offset/count fields have been removed from String then
> it could only be the count == value.length case. It would clearly be a
> win in some cases but no help in others.

Right - I was still using a offset/count based mental model for String 
but that doesn't apply any more so we can't share directly except in one 
case. And from past experiences with StringBuffer issues it seems that 
accurately sizing the SB based on the expected final size is quite a 
rarity so the copy-on-write optimization is not worth pursuing (even if 
it were implementable in a reasonable way - thanks Peter for the 
additional investigation here!)

So here is hopefully final webrev:

http://cr.openjdk.java.net/~dholmes/8013395/webrev.v5/

It is the same approach as v3, but as Florian pointed out the cache 
should be cleared before the mutating action - just in case there is an 
exception.

That leaves one issue that was flagged by a couple of folks: hotspot 
intrinsification of specific "string" usage patterns. I tracked this 
down in the hotspot code and I think it only applies in situations where 
the StringBuffer/StringBuilder could be elided completely - and so would 
not be an issue here. But I'm confirming this with the hotspot compiler 
folk (unfortunately the optimization is not clearly documented anywhere.)

Thanks,
David

> -Alan.