Bugs in java.util.ArrayList, java.util.Hashtable and java.io.ByteArrayOutputStream
David Holmes
David.Holmes at oracle.com
Thu May 13 01:59:43 UTC 2010
Hi Martin,
Bugtraq is offline so I can't file a CR right now.
The was caught by Mauve tests. Kelly sent a link to the mailing list but
I don't think he's a member so it's probably held up for approval.
Martin Buchholz said the following on 05/13/10 11:38:
> Webrev at
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/StringBuffer-capacity/StringBuffer-capacity.patch
Fix looks fine. Took me a few tries to confirm the test logic :)
Thanks,
David
-----
> Martin
>
> On Wed, May 12, 2010 at 18:23, Martin Buchholz <martinrb at google.com> wrote:
>> Alright, I'm convinced this is a bug. Please file one in bugtraq
>> (or expunge from tl in the unlikely event that would be permitted).
>> It appears we very carefully specified the capacity manipulation behavior,
>> but didn't have any jtreg tests for it.
>> Do y'all run the Mauve tests? (Probably a good idea)
>>
>> Martin
>>
>> On Wed, May 12, 2010 at 17:31, David Holmes <David.Holmes at oracle.com> wrote:
>>> Chris, Martin,
>>>
>>> The changes to AbstractStringBuilder violate the specification for
>>> ensureCapacity: it has to grow from N to 2N+2. The new code in:
>>>
>>> http://hg.openjdk.java.net/jdk7/tl-gate/jdk/rev/ec45423a4700
>>>
>>> void expandCapacity(int minimumCapacity) {
>>> - int newCapacity = (value.length + 1) * 2;
>>> + int newCapacity = value.length * 2;
>>>
>>> Has dropped the +2 part.
>>>
>>> This is causing test failures.
>>>
>>> David
>>> -----
>>>
>>> Chris Hegarty said the following on 04/16/10 19:38:
>>>> Martin Buchholz wrote:
>>>>> Hi Chris,
>>>>>
>>>>> I recently discovered another place to handle huge arrays better - in
>>>>> AbstractCollection.
>>>>> I've put those changes into
>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize2/
>>>>> I propose to qfold these into the original changes for this bug
>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>>>>> which have not yet been committed.
>>>> Good catch Martin, these additional changes look good.
>>>>
>>>> -Chris.
>>>>
>>>>> Martin
>>>>>
>>>>> On Tue, Mar 9, 2010 at 04:41, Christopher Hegarty -Sun Microsystems
>>>>> Ireland <Christopher.Hegarty at sun.com> wrote:
>>>>>> Sorry Martin, I appear to have missed your original request to file this
>>>>>> bug. I since filed the following:
>>>>>>
>>>>>> 6933217: Huge arrays handled poorly in core libraries
>>>>>>
>>>>>> The changes you are proposing seem reasonable to me.
>>>>>>
>>>>>> -Chris.
>>>>>>
>>>>>> Martin Buchholz wrote:
>>>>>>> [Chris or Alan, please review and file a bug]
>>>>>>>
>>>>>>> OK, guys,
>>>>>>>
>>>>>>> Here's a patch:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com>
>>>>>>> wrote:
>>>>>>>> Hi Martin,
>>>>>>>>
>>>>>>>> Thank you for your reply. If I may, PriorityQueue appears to employ
>>>>>>>> the
>>>>>>>> simple strategy that I suggested above in its grow method:
>>>>>>>>
>>>>>>>> int newCapacity = ((oldCapacity < 64)?
>>>>>>>> ((oldCapacity + 1) * 2):
>>>>>>>> ((oldCapacity / 2) * 3));
>>>>>>>> if (newCapacity < 0) // overflow
>>>>>>>> newCapacity = Integer.MAX_VALUE;
>>>>>>>>
>>>>>>>> It might be desirable to set a common strategy for capacity increase
>>>>>>>> for
>>>>>>>> all
>>>>>>>> collections.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Kevin
>>>>>>>>
>>>>>>>> On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com>
>>>>>>>> wrote:
>>>>>>>>> Hi Kevin,
>>>>>>>>>
>>>>>>>>> As you've noticed, creating objects within a factor of two of
>>>>>>>>> their natural limits is a good way to expose lurking bugs.
>>>>>>>>>
>>>>>>>>> I'm the one responsible for the algorithm in ArrayList.
>>>>>>>>> I'm a bit embarrassed, looking at that code today.
>>>>>>>>> We could set the array size to Integer.MAX_VALUE,
>>>>>>>>> but then you might hit an independent buglet in hotspot
>>>>>>>>> that you cannot allocate an array with Integer.MAX_VALUE
>>>>>>>>> elements, but Integer.MAX_VALUE - 5 (or so) works.
>>>>>>>>>
>>>>>>>>> It occurs to me that increasing the size by 50% is better done by
>>>>>>>>> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>>>>>>>>>
>>>>>>>>> I agree with the plan of setting the capacity to something near
>>>>>>>>> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
>>>>>>>>>
>>>>>>>>> These bugs are not known.
>>>>>>>>> Chris Hegarty, could you file a bug for us?
>>>>>>>>>
>>>>>>>>> Martin
>>>>>>>>>
>>>>>>>>> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern
>>>>>>>>> <kevin.l.stern at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
>>>>>>>>>> java.io.ByteArrayOutputStream which arise when the capacities of the
>>>>>>>>>> data
>>>>>>>>>> structures reach a particular threshold. More below.
>>>>>>>>>>
>>>>>>>>>> When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE
>>>>>>>>>> its
>>>>>>>>>> size
>>>>>>>>>> reaches its capacity and an add or an insert operation is invoked,
>>>>>>>>>> the
>>>>>>>>>> capacity is increased by only one element. Notice that in the
>>>>>>>>>> following
>>>>>>>>>> excerpt from ArrayList.ensureCapacity the new capacity is set to
>>>>>>>>>> (3/2)
>>>>>>>>>> *
>>>>>>>>>> oldCapacity + 1 unless this value would not suffice to accommodate
>>>>>>>>>> the
>>>>>>>>>> required capacity in which case it is set to the required capacity.
>>>>>>>>>> If
>>>>>>>>>> the
>>>>>>>>>> current capacity is at least (2/3)*Integer.MAX_VALUE, then
>>>>>>>>>> (oldCapacity
>>>>>>>>>> *
>>>>>>>>>> 3)/2 + 1 overflows and resolves to a negative number resulting in
>>>>>>>>>> the
>>>>>>>>>> new
>>>>>>>>>> capacity being set to the required capacity. The major consequence
>>>>>>>>>> of
>>>>>>>>>> this
>>>>>>>>>> is that each subsequent add/insert operation results in a full
>>>>>>>>>> resize
>>>>>>>>>> of
>>>>>>>>>> the
>>>>>>>>>> ArrayList causing performance to degrade significantly.
>>>>>>>>>>
>>>>>>>>>> int newCapacity = (oldCapacity * 3)/2 + 1;
>>>>>>>>>> if (newCapacity < minCapacity)
>>>>>>>>>> newCapacity = minCapacity;
>>>>>>>>>>
>>>>>>>>>> Hashtable breaks entirely when the size of its backing array reaches
>>>>>>>>>> (1/2) *
>>>>>>>>>> Integer.MAX_VALUE and a rehash is necessary as is evident from the
>>>>>>>>>> following
>>>>>>>>>> excerpt from rehash. Notice that rehash will attempt to create an
>>>>>>>>>> array
>>>>>>>>>> of
>>>>>>>>>> negative size if the size of the backing array reaches (1/2) *
>>>>>>>>>> Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves
>>>>>>>>>> to a
>>>>>>>>>> negative number.
>>>>>>>>>>
>>>>>>>>>> int newCapacity = oldCapacity * 2 + 1;
>>>>>>>>>> HashtableEntry newTable[] = new HashtableEntry[newCapacity];
>>>>>>>>>>
>>>>>>>>>> When the capacity of the backing array in a ByteArrayOutputStream
>>>>>>>>>> reaches
>>>>>>>>>> (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
>>>>>>>>>> operation is invoked, the capacity of the backing array is increased
>>>>>>>>>> only by
>>>>>>>>>> the required number of elements. Notice that in the following
>>>>>>>>>> excerpt
>>>>>>>>>> from
>>>>>>>>>> ByteArrayOutputStream.write(int) the new backing array capacity is
>>>>>>>>>> set
>>>>>>>>>> to 2
>>>>>>>>>> * buf.length unless this value would not suffice to accommodate the
>>>>>>>>>> required
>>>>>>>>>> capacity in which case it is set to the required capacity. If the
>>>>>>>>>> current
>>>>>>>>>> backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1,
>>>>>>>>>> then
>>>>>>>>>> buf.length << 1 overflows and resolves to a negative number
>>>>>>>>>> resulting
>>>>>>>>>> in
>>>>>>>>>> the
>>>>>>>>>> new capacity being set to the required capacity. The major
>>>>>>>>>> consequence
>>>>>>>>>> of
>>>>>>>>>> this, like with ArrayList, is that each subsequent write operation
>>>>>>>>>> results
>>>>>>>>>> in a full resize of the ByteArrayOutputStream causing performance to
>>>>>>>>>> degrade
>>>>>>>>>> significantly.
>>>>>>>>>>
>>>>>>>>>> int newcount = count + 1;
>>>>>>>>>> if (newcount > buf.length) {
>>>>>>>>>> buf = Arrays.copyOf(buf, Math.max(buf.length << 1,
>>>>>>>>>> newcount));
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> It is interesting to note that any statements about the amortized
>>>>>>>>>> time
>>>>>>>>>> complexity of add/insert operations, such as the one in the
>>>>>>>>>> ArrayList
>>>>>>>>>> javadoc, are invalidated by the performance related bugs. One
>>>>>>>>>> solution
>>>>>>>>>> to
>>>>>>>>>> the above situations is to set the new capacity of the backing array
>>>>>>>>>> to
>>>>>>>>>> Integer.MAX_VALUE when the initial size calculation results in a
>>>>>>>>>> negative
>>>>>>>>>> number during a resize.
>>>>>>>>>>
>>>>>>>>>> Apologies if these bugs are already known.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Kevin
>>>>>>>>>>
More information about the core-libs-dev
mailing list