RFR (XS) 8076759: AbstractStringBuilder.append(...) should ensure enough capacity for the upcoming "trivial" append calls

Fri May 1 17:19:11 UTC 2015

Hi Aleksey,

Is there any additional benefit by rounding up the next multiple of 4 or 8.
That would avoid a few wasted bytes at the end of the buffer modulo the 
allocation size.

Otherwise, looks fine to me also.

Roger

On 5/1/2015 10:09 AM, Aleksey Shipilev wrote:
> Anyone?
>
> -Aleksey
>
> On 04/24/2015 09:05 PM, Aleksey Shipilev wrote:
>> Hi,
>>
>> This seems to be a simple one-liner fix, but the background is more
>> complicated. See the bug:
>>    https://bugs.openjdk.java.net/browse/JDK-8076759
>>
>> The bottom line is that our current resizing policy in ASB is hostile
>> for long appends. There is a heuristics that extends the capacity to
>> match the *exact* length of append if doubling the array would not help.
>>
>> This heuristics has a nasty corner case: if there is an upcoming append
>> after a large one, then we are guaranteed to re-size again. If an
>> upcoming append is large in itself, the resizing is inevitable even
>> under the doubling-the-storage strategy; but if we only do a small
>> append, then we can be smarter.
>>
>> After trying a few options to fix this (see below), I have settled on
>> just adding a simple static "pad", to absorb the trivial appends after a
>> large append:
>>    http://cr.openjdk.java.net/~shade/8076759/webrev.00/
>>
>> The choice of "32" as magic number is deliberate: arraycopy likes large
>> power-of-two strides (and it does not like to make catch up loops for
>> small residuals). "16" is too small to fit the decimal representation of
>> Long.MIN_VALUE, therefore, we pick "32".
>>
>> There are other approaches, briefly mentioned here:
>>    http://cr.openjdk.java.net/~shade/8076759/patches.txt
>>
>> There is a direct correlation between the allocation pressure, and test
>> performance:
>>    http://cr.openjdk.java.net/~shade/8076759/data-perf.png
>>    http://cr.openjdk.java.net/~shade/8076759/data-foot.png
>>
>> Naively, one could expect doubling the storage ("mult2") until we reach
>> $minimalCapacity solves the problem, but it wastes too much memory, and
>> only reaches the "plus32" on power-of-two sizes. That is also the
>> Achilles' Heel of the heuristics, because appending the
>> power-of-two-plus-one-sized string will set us up for the original
>> problem. This effect can be alleviated by doing the padding as well
>> ("mult2-plus32"). Exactly the same trouble manifests on smaller strings
>> that go through the usual double-the-storage route, and this is why a
>> proposed patch makes the pad on common path.
>>
>> I do believe the current heuristics is smart about large appends, and
>> mult2* strategies undo it. Therefore, I would think keeping the
>> minimumCapacity cap is a good thing, and just adding the pad is a good
>> solution. Thus, it is in the webrev.
>>
>> Thanks,
>> -Aleksey.
>>
>