Bugs in java.util.ArrayList, java.util.Hashtable and java.io.ByteArrayOutputStream
Goktug Gokdogan
gokdogan at gmail.com
Wed Mar 10 09:58:38 UTC 2010
Similarly,
BitSet.ensureCapacity
AbstractStringBuilder.expandCapacity
Vector.ensureCapacityHelper
methods need to have similar checks and/or throw proper exceptions.
By the way, I did not understand why IdentityHashMap and HashMap have
different MAXIMUM_CAPACITY and different logic to handle resize and
overflow.
On Mon, Mar 8, 2010 at 8:10 PM, Martin Buchholz <martinrb at google.com> wrote:
> [Chris or Alan, please review and file a bug]
>
> OK, guys,
>
> Here's a patch:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>
> Martin
>
> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
> > Hi Martin,
> >
> > Thank you for your reply. If I may, PriorityQueue appears to employ the
> > simple strategy that I suggested above in its grow method:
> >
> > int newCapacity = ((oldCapacity < 64)?
> > ((oldCapacity + 1) * 2):
> > ((oldCapacity / 2) * 3));
> > if (newCapacity < 0) // overflow
> > newCapacity = Integer.MAX_VALUE;
> >
> > It might be desirable to set a common strategy for capacity increase for
> all
> > collections.
> >
> > Regards,
> >
> > Kevin
> >
> > On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com>
> wrote:
> >>
> >> Hi Kevin,
> >>
> >> As you've noticed, creating objects within a factor of two of
> >> their natural limits is a good way to expose lurking bugs.
> >>
> >> I'm the one responsible for the algorithm in ArrayList.
> >> I'm a bit embarrassed, looking at that code today.
> >> We could set the array size to Integer.MAX_VALUE,
> >> but then you might hit an independent buglet in hotspot
> >> that you cannot allocate an array with Integer.MAX_VALUE
> >> elements, but Integer.MAX_VALUE - 5 (or so) works.
> >>
> >> It occurs to me that increasing the size by 50% is better done by
> >> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
> >>
> >> I agree with the plan of setting the capacity to something near
> >> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
> >>
> >> These bugs are not known.
> >> Chris Hegarty, could you file a bug for us?
> >>
> >> Martin
> >>
> >> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com>
> >> wrote:
> >> > Greetings,
> >> >
> >> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
> >> > java.io.ByteArrayOutputStream which arise when the capacities of the
> >> > data
> >> > structures reach a particular threshold. More below.
> >> >
> >> > When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its
> >> > size
> >> > reaches its capacity and an add or an insert operation is invoked, the
> >> > capacity is increased by only one element. Notice that in the
> following
> >> > excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2)
> *
> >> > oldCapacity + 1 unless this value would not suffice to accommodate the
> >> > required capacity in which case it is set to the required capacity.
> If
> >> > the
> >> > current capacity is at least (2/3)*Integer.MAX_VALUE, then
> (oldCapacity
> >> > *
> >> > 3)/2 + 1 overflows and resolves to a negative number resulting in the
> >> > new
> >> > capacity being set to the required capacity. The major consequence of
> >> > this
> >> > is that each subsequent add/insert operation results in a full resize
> of
> >> > the
> >> > ArrayList causing performance to degrade significantly.
> >> >
> >> > int newCapacity = (oldCapacity * 3)/2 + 1;
> >> > if (newCapacity < minCapacity)
> >> > newCapacity = minCapacity;
> >> >
> >> > Hashtable breaks entirely when the size of its backing array reaches
> >> > (1/2) *
> >> > Integer.MAX_VALUE and a rehash is necessary as is evident from the
> >> > following
> >> > excerpt from rehash. Notice that rehash will attempt to create an
> array
> >> > of
> >> > negative size if the size of the backing array reaches (1/2) *
> >> > Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to
> a
> >> > negative number.
> >> >
> >> > int newCapacity = oldCapacity * 2 + 1;
> >> > HashtableEntry newTable[] = new HashtableEntry[newCapacity];
> >> >
> >> > When the capacity of the backing array in a ByteArrayOutputStream
> >> > reaches
> >> > (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
> >> > operation is invoked, the capacity of the backing array is increased
> >> > only by
> >> > the required number of elements. Notice that in the following excerpt
> >> > from
> >> > ByteArrayOutputStream.write(int) the new backing array capacity is set
> >> > to 2
> >> > * buf.length unless this value would not suffice to accommodate the
> >> > required
> >> > capacity in which case it is set to the required capacity. If the
> >> > current
> >> > backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
> >> > buf.length << 1 overflows and resolves to a negative number resulting
> in
> >> > the
> >> > new capacity being set to the required capacity. The major
> consequence
> >> > of
> >> > this, like with ArrayList, is that each subsequent write operation
> >> > results
> >> > in a full resize of the ByteArrayOutputStream causing performance to
> >> > degrade
> >> > significantly.
> >> >
> >> > int newcount = count + 1;
> >> > if (newcount > buf.length) {
> >> > buf = Arrays.copyOf(buf, Math.max(buf.length << 1,
> >> > newcount));
> >> > }
> >> >
> >> > It is interesting to note that any statements about the amortized time
> >> > complexity of add/insert operations, such as the one in the ArrayList
> >> > javadoc, are invalidated by the performance related bugs. One
> solution
> >> > to
> >> > the above situations is to set the new capacity of the backing array
> to
> >> > Integer.MAX_VALUE when the initial size calculation results in a
> >> > negative
> >> > number during a resize.
> >> >
> >> > Apologies if these bugs are already known.
> >> >
> >> > Regards,
> >> >
> >> > Kevin
> >> >
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100310/6c75ac76/attachment.html>
More information about the core-libs-dev
mailing list