Bugs in java.util.ArrayList, java.util.Hashtable and java.io.ByteArrayOutputStream

David Holmes David.Holmes at oracle.com
Thu May 13 22:46:09 UTC 2010


CR: 6952330 Fix for 6933217 broke contract of StringBuffer.ensureCapacity

Thanks,
David

David Holmes said the following on 05/13/10 11:59:
> Hi Martin,
> 
> Bugtraq is offline so I can't file a CR right now.
> 
> This was caught by the Mauve tests. Kelly sent a link to the mailing list but 
> I don't think he's a member so it's probably held up for approval.
> 
> Martin Buchholz said the following on 05/13/10 11:38:
>> Webrev at
>>
>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/StringBuffer-capacity/StringBuffer-capacity.patch 
>>
> 
> Fix looks fine. Took me a few tries to confirm the test logic :)
> 
> Thanks,
> David
> -----
> 
> 
>> Martin
>>
>> On Wed, May 12, 2010 at 18:23, Martin Buchholz <martinrb at google.com> 
>> wrote:
>>> Alright, I'm convinced this is a bug.  Please file one in bugtraq
>>> (or expunge from tl in the unlikely event that would be permitted).
>>> It appears we very carefully specified the capacity manipulation 
>>> behavior,
>>> but didn't have any jtreg tests for it.
>>> Do y'all run the Mauve tests?  (Probably a good idea)
>>>
>>> Martin
>>>
>>> On Wed, May 12, 2010 at 17:31, David Holmes <David.Holmes at oracle.com> 
>>> wrote:
>>>> Chris, Martin,
>>>>
>>>> The changes to AbstractStringBuilder violate the specification for
>>>> ensureCapacity: it has to grow from N to 2N+2. The new code in:
>>>>
>>>> http://hg.openjdk.java.net/jdk7/tl-gate/jdk/rev/ec45423a4700
>>>>
>>>>      void expandCapacity(int minimumCapacity) {
>>>> -        int newCapacity = (value.length + 1) * 2;
>>>> +        int newCapacity = value.length * 2;
>>>>
>>>> has dropped the +2 part, so the buffer now grows from N to 2N instead of
>>>> the specified 2N+2.
>>>>
>>>> This is causing test failures.
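>>>>
>>>> For reference, a rough sketch of an overflow-conscious version that still
>>>> honours the documented 2N+2 growth (an illustration only, not necessarily
>>>> what the webrev does; value is the char[] field of AbstractStringBuilder):
>>>>
>>>>     void expandCapacity(int minimumCapacity) {
>>>>         // grow to at least 2*oldCapacity + 2, per the ensureCapacity spec
>>>>         int newCapacity = value.length * 2 + 2;
>>>>         if (newCapacity - minimumCapacity < 0)
>>>>             newCapacity = minimumCapacity;
>>>>         if (newCapacity < 0) {           // 2N+2 itself overflowed int
>>>>             if (minimumCapacity < 0)     // the request itself overflowed
>>>>                 throw new OutOfMemoryError();
>>>>             newCapacity = Integer.MAX_VALUE;
>>>>         }
>>>>         value = Arrays.copyOf(value, newCapacity);
>>>>     }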
>>>>
>>>> David
>>>> -----
>>>>
>>>> Chris Hegarty said the following on 04/16/10 19:38:
>>>>> Martin Buchholz wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> I recently discovered another place to handle huge arrays better - in
>>>>>> AbstractCollection.
>>>>>> I've put those changes into
>>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize2/
>>>>>> I propose to qfold these into the original changes for this bug
>>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>>>>>> which have not yet been committed.
>>>>> Good catch Martin, these additional changes look good.
>>>>>
>>>>> -Chris.
>>>>>
>>>>>> Martin
>>>>>>
>>>>>> On Tue, Mar 9, 2010 at 04:41, Christopher Hegarty -Sun Microsystems
>>>>>> Ireland <Christopher.Hegarty at sun.com> wrote:
>>>>>>> Sorry Martin, I appear to have missed your original request to file
>>>>>>> this bug. I have since filed the following:
>>>>>>>
>>>>>>>  6933217: Huge arrays handled poorly in core libraries
>>>>>>>
>>>>>>> The changes you are proposing seem reasonable to me.
>>>>>>>
>>>>>>> -Chris.
>>>>>>>
>>>>>>> Martin Buchholz wrote:
>>>>>>>> [Chris or Alan, please review and file a bug]
>>>>>>>>
>>>>>>>> OK, guys,
>>>>>>>>
>>>>>>>> Here's a patch:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>>>>>>>>
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern 
>>>>>>>> <kevin.l.stern at gmail.com>
>>>>>>>> wrote:
>>>>>>>>> Hi Martin,
>>>>>>>>>
>>>>>>>>> Thank you for your reply.  If I may, PriorityQueue appears to employ
>>>>>>>>> the simple strategy that I suggested above in its grow method:
>>>>>>>>>
>>>>>>>>>       int newCapacity = ((oldCapacity < 64)?
>>>>>>>>>                          ((oldCapacity + 1) * 2):
>>>>>>>>>                          ((oldCapacity / 2) * 3));
>>>>>>>>>       if (newCapacity < 0) // overflow
>>>>>>>>>           newCapacity = Integer.MAX_VALUE;
>>>>>>>>>
>>>>>>>>> It might be desirable to set a common strategy for capacity increase
>>>>>>>>> for all collections.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Kevin
>>>>>>>>>
>>>>>>>>> On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz 
>>>>>>>>> <martinrb at google.com>
>>>>>>>>> wrote:
>>>>>>>>>> Hi Kevin,
>>>>>>>>>>
>>>>>>>>>> As you've noticed, creating objects within a factor of two of
>>>>>>>>>> their natural limits is a good way to expose lurking bugs.
>>>>>>>>>>
>>>>>>>>>> I'm the one responsible for the algorithm in ArrayList.
>>>>>>>>>> I'm a bit embarrassed, looking at that code today.
>>>>>>>>>> We could set the array size to Integer.MAX_VALUE,
>>>>>>>>>> but then you might hit an independent buglet in HotSpot:
>>>>>>>>>> you cannot allocate an array with Integer.MAX_VALUE
>>>>>>>>>> elements, but Integer.MAX_VALUE - 5 (or so) works.
>>>>>>>>>>
>>>>>>>>>> It occurs to me that increasing the size by 50% is better done by
>>>>>>>>>> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>>>>>>>>>>
>>>>>>>>>> I agree with the plan of setting the capacity to something near
>>>>>>>>>> MAX_VALUE on overflow, and throwing OutOfMemoryError on the next resize.
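>>>>>>>>>>
>>>>>>>>>> A sketch of how that plan might look for ArrayList's grow path
>>>>>>>>>> (illustration only; MAX_ARRAY_SIZE is a hypothetical headroom
>>>>>>>>>> constant such as Integer.MAX_VALUE - 8, kept below the hotspot
>>>>>>>>>> array-size limit mentioned above):
>>>>>>>>>>
>>>>>>>>>>     private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
>>>>>>>>>>
>>>>>>>>>>     private void grow(int minCapacity) {
>>>>>>>>>>         // overflow-conscious: grow by ~50%, then repair any wraparound
>>>>>>>>>>         int oldCapacity = elementData.length;
>>>>>>>>>>         int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>>>>>>>>>>         if (newCapacity - minCapacity < 0)
>>>>>>>>>>             newCapacity = minCapacity;
>>>>>>>>>>         if (newCapacity - MAX_ARRAY_SIZE > 0)
>>>>>>>>>>             newCapacity = hugeCapacity(minCapacity);
>>>>>>>>>>         elementData = Arrays.copyOf(elementData, newCapacity);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>     private static int hugeCapacity(int minCapacity) {
>>>>>>>>>>         if (minCapacity < 0)   // int overflow: request cannot be satisfied
>>>>>>>>>>             throw new OutOfMemoryError();
>>>>>>>>>>         return (minCapacity > MAX_ARRAY_SIZE)
>>>>>>>>>>             ? Integer.MAX_VALUE
>>>>>>>>>>             : MAX_ARRAY_SIZE;
>>>>>>>>>>     }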
>>>>>>>>>>
>>>>>>>>>> These bugs are not known.
>>>>>>>>>> Chris Hegarty, could you file a bug for us?
>>>>>>>>>>
>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern
>>>>>>>>>> <kevin.l.stern at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> Greetings,
>>>>>>>>>>>
>>>>>>>>>>> I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
>>>>>>>>>>> java.io.ByteArrayOutputStream which arise when the capacities of
>>>>>>>>>>> the data structures reach a particular threshold.  More below.
>>>>>>>>>>>
>>>>>>>>>>> When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE,
>>>>>>>>>>> its size reaches its capacity, and an add or an insert operation is
>>>>>>>>>>> invoked, the capacity is increased by only one element.  Notice
>>>>>>>>>>> that in the following excerpt from ArrayList.ensureCapacity the new
>>>>>>>>>>> capacity is set to (3/2) * oldCapacity + 1 unless this value would
>>>>>>>>>>> not suffice to accommodate the required capacity, in which case it
>>>>>>>>>>> is set to the required capacity.  If the current capacity is at
>>>>>>>>>>> least (2/3)*Integer.MAX_VALUE, then (oldCapacity * 3)/2 + 1
>>>>>>>>>>> overflows and wraps to a value far below the required capacity,
>>>>>>>>>>> resulting in the new capacity being set to the required capacity.
>>>>>>>>>>> The major consequence of this is that each subsequent add/insert
>>>>>>>>>>> operation results in a full resize of the ArrayList, causing
>>>>>>>>>>> performance to degrade significantly.
>>>>>>>>>>>
>>>>>>>>>>>     int newCapacity = (oldCapacity * 3)/2 + 1;
>>>>>>>>>>>     if (newCapacity < minCapacity)
>>>>>>>>>>>         newCapacity = minCapacity;
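>>>>>>>>>>>
>>>>>>>>>>> For example (illustrative arithmetic only, not JDK code), once the
>>>>>>>>>>> intermediate product oldCapacity * 3 wraps past Integer.MAX_VALUE
>>>>>>>>>>> the formula can no longer keep up with minCapacity:
>>>>>>>>>>>
>>>>>>>>>>>     int oldCapacity = 1431655766;      // just above (2/3) * Integer.MAX_VALUE
>>>>>>>>>>>     int minCapacity = oldCapacity + 1; // size == capacity, one more add arrives
>>>>>>>>>>>     int newCapacity = (oldCapacity * 3)/2 + 1; // product wraps; result is 2
>>>>>>>>>>>     if (newCapacity < minCapacity)
>>>>>>>>>>>         newCapacity = minCapacity;     // capacity grows by exactly one element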
>>>>>>>>>>>
>>>>>>>>>>> Hashtable breaks entirely when the size of its backing array
>>>>>>>>>>> reaches (1/2) * Integer.MAX_VALUE and a rehash is necessary, as is
>>>>>>>>>>> evident from the following excerpt from rehash.  Notice that rehash
>>>>>>>>>>> will attempt to create an array of negative size if the size of the
>>>>>>>>>>> backing array reaches (1/2) * Integer.MAX_VALUE, since
>>>>>>>>>>> oldCapacity * 2 + 1 overflows and resolves to a negative number.
>>>>>>>>>>>
>>>>>>>>>>>   int newCapacity = oldCapacity * 2 + 1;
>>>>>>>>>>>   HashtableEntry newTable[] = new HashtableEntry[newCapacity];
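>>>>>>>>>>>
>>>>>>>>>>> A stand-alone illustration of that failure (not Hashtable itself):
>>>>>>>>>>>
>>>>>>>>>>>     int oldCapacity = 1073741824;          // 2^30, just over (1/2) * Integer.MAX_VALUE
>>>>>>>>>>>     int newCapacity = oldCapacity * 2 + 1; // wraps to -2147483647
>>>>>>>>>>>     Object[] newTable = new Object[newCapacity]; // throws NegativeArraySizeException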
>>>>>>>>>>>
>>>>>>>>>>> When the capacity of the backing array in a ByteArrayOutputStream
>>>>>>>>>>> reaches (1/2) * Integer.MAX_VALUE, its size reaches its capacity,
>>>>>>>>>>> and a write operation is invoked, the capacity of the backing array
>>>>>>>>>>> is increased only by the required number of elements.  Notice that
>>>>>>>>>>> in the following excerpt from ByteArrayOutputStream.write(int) the
>>>>>>>>>>> new backing array capacity is set to 2 * buf.length unless this
>>>>>>>>>>> value would not suffice to accommodate the required capacity, in
>>>>>>>>>>> which case it is set to the required capacity.  If the current
>>>>>>>>>>> backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1,
>>>>>>>>>>> then buf.length << 1 overflows and resolves to a negative number,
>>>>>>>>>>> resulting in the new capacity being set to the required capacity.
>>>>>>>>>>> The major consequence of this, like with ArrayList, is that each
>>>>>>>>>>> subsequent write operation results in a full resize of the
>>>>>>>>>>> ByteArrayOutputStream, causing performance to degrade significantly.
>>>>>>>>>>>
>>>>>>>>>>>     int newcount = count + 1;
>>>>>>>>>>>     if (newcount > buf.length) {
>>>>>>>>>>>         buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
>>>>>>>>>>>     }
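>>>>>>>>>>>
>>>>>>>>>>> The same wraparound in stand-alone form (illustration only):
>>>>>>>>>>>
>>>>>>>>>>>     int bufLength = 1073741824;        // 2^30 bytes already in buf
>>>>>>>>>>>     int newcount = bufLength + 1;      // one more byte to write
>>>>>>>>>>>     int doubled = bufLength << 1;      // wraps to Integer.MIN_VALUE
>>>>>>>>>>>     int newSize = Math.max(doubled, newcount); // negative doubling loses,
>>>>>>>>>>>                                        // so the array grows by one byte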
>>>>>>>>>>>
>>>>>>>>>>> It is interesting to note that any statements about the amortized
>>>>>>>>>>> time complexity of add/insert operations, such as the one in the
>>>>>>>>>>> ArrayList javadoc, are invalidated by the performance-related bugs.
>>>>>>>>>>> One solution to the above situations is to set the new capacity of
>>>>>>>>>>> the backing array to Integer.MAX_VALUE when the initial size
>>>>>>>>>>> calculation results in a negative number during a resize.
>>>>>>>>>>>
>>>>>>>>>>> Apologies if these bugs are already known.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>>> Kevin
>>>>>>>>>>>


