Bugs in java.util.ArrayList, java.util.Hashtable and java.io.ByteArrayOutputStream
Martin Buchholz
martinrb at google.com
Tue Mar 9 02:10:37 UTC 2010
[Chris or Alan, please review and file a bug]
OK, guys,
Here's a patch:
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
Martin
On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> Hi Martin,
>
> Thank you for your reply. If I may, PriorityQueue appears to employ the
> simple strategy that I suggested above in its grow method:
>
> int newCapacity = ((oldCapacity < 64)?
> ((oldCapacity + 1) * 2):
> ((oldCapacity / 2) * 3));
> if (newCapacity < 0) // overflow
> newCapacity = Integer.MAX_VALUE;
>
> It might be desirable to set a common strategy for capacity increase for all
> collections.
>
> Regards,
>
> Kevin
>
> On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com> wrote:
>>
>> Hi Kevin,
>>
>> As you've noticed, creating objects within a factor of two of
>> their natural limits is a good way to expose lurking bugs.
>>
>> I'm the one responsible for the algorithm in ArrayList.
>> I'm a bit embarrassed, looking at that code today.
>> We could set the array size to Integer.MAX_VALUE,
>> but then you might hit an independent buglet in hotspot
>> that you cannot allocate an array with Integer.MAX_VALUE
>> elements, but Integer.MAX_VALUE - 5 (or so) works.
>>
>> It occurs to me that increasing the size by 50% is better done by
>> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>>
>> I agree with the plan of setting the capacity to something near
>> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
>>
>> These bugs are not known.
>> Chris Hegarty, could you file a bug for us?
>>
>> Martin
>>
>> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com>
>> wrote:
>> > Greetings,
>> >
>> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
>> > java.io.ByteArrayOutputStream which arise when the capacities of the
>> > data
>> > structures reach a particular threshold. More below.
>> >
>> > When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its
>> > size
>> > reaches its capacity and an add or an insert operation is invoked, the
>> > capacity is increased by only one element. Notice that in the following
>> > excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
>> > oldCapacity + 1 unless this value would not suffice to accommodate the
>> > required capacity in which case it is set to the required capacity. If
>> > the
>> > current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity
>> > *
>> > 3)/2 + 1 overflows and resolves to a negative number resulting in the
>> > new
>> > capacity being set to the required capacity. The major consequence of
>> > this
>> > is that each subsequent add/insert operation results in a full resize of
>> > the
>> > ArrayList causing performance to degrade significantly.
>> >
>> > int newCapacity = (oldCapacity * 3)/2 + 1;
>> > if (newCapacity < minCapacity)
>> > newCapacity = minCapacity;
>> >
>> > Hashtable breaks entirely when the size of its backing array reaches
>> > (1/2) *
>> > Integer.MAX_VALUE and a rehash is necessary as is evident from the
>> > following
>> > excerpt from rehash. Notice that rehash will attempt to create an array
>> > of
>> > negative size if the size of the backing array reaches (1/2) *
>> > Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
>> > negative number.
>> >
>> > int newCapacity = oldCapacity * 2 + 1;
>> > HashtableEntry newTable[] = new HashtableEntry[newCapacity];
>> >
>> > When the capacity of the backing array in a ByteArrayOutputStream
>> > reaches
>> > (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
>> > operation is invoked, the capacity of the backing array is increased
>> > only by
>> > the required number of elements. Notice that in the following excerpt
>> > from
>> > ByteArrayOutputStream.write(int) the new backing array capacity is set
>> > to 2
>> > * buf.length unless this value would not suffice to accommodate the
>> > required
>> > capacity in which case it is set to the required capacity. If the
>> > current
>> > backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
>> > buf.length << 1 overflows and resolves to a negative number resulting in
>> > the
>> > new capacity being set to the required capacity. The major consequence
>> > of
>> > this, like with ArrayList, is that each subsequent write operation
>> > results
>> > in a full resize of the ByteArrayOutputStream causing performance to
>> > degrade
>> > significantly.
>> >
>> > int newcount = count + 1;
>> > if (newcount > buf.length) {
>> > buf = Arrays.copyOf(buf, Math.max(buf.length << 1,
>> > newcount));
>> > }
>> >
>> > It is interesting to note that any statements about the amortized time
>> > complexity of add/insert operations, such as the one in the ArrayList
>> > javadoc, are invalidated by the performance related bugs. One solution
>> > to
>> > the above situations is to set the new capacity of the backing array to
>> > Integer.MAX_VALUE when the initial size calculation results in a
>> > negative
>> > number during a resize.
>> >
>> > Apologies if these bugs are already known.
>> >
>> > Regards,
>> >
>> > Kevin
>> >
>
>
More information about the core-libs-dev
mailing list