Bugs in java.util.ArrayList, java.util.Hashtable and java.io.ByteArrayOutputStream
develop4lasu at gmail.com
develop4lasu at gmail.com
Fri Mar 5 09:47:39 UTC 2010
Hello,
I'm using my own Collections if it's possible so I can add some thoughts:
1. I would decrease default array size to 4/6/8, for me it was few Mb more
of free memory ( i suggest testing on application that use at least 300Mb)
I would test:
initial size: 4
long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;
initial size: 6
long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;
initial size: 8
long newCapacity = ((long)oldCapacity) + (oldCapacity >> 1) + 2;
initial size: 4
long newCapacity = ((long)oldCapacity) + (oldCapacity >> 2) + 4;
and then use:
> (int)Math.min(newCapacity, Integer.MAX_VALUE);
Would be nice then for:
Collections.addAll(...)
to ask for proper capacity before adding any elements
Greetings.
W dniu 05-03-2010 10:04 Martin Buchholz <martinrb at google.com> napisał(a):
> Hi Kevin,
> As you've noticed, creating objects within a factor of two of
> their natural limits is a good way to expose lurking bugs.
> I'm the one responsible for the algorithm in ArrayList.
> I'ma bit embarrassed, looking at that code today.
> We could set the array size to Integer.MAX_VALUE,
> but then you might hit an independent buglet in hotspot
> that you cannot allocate an array with Integer.MAX_VALUE
> elements, but Integer.MAX_VALUE - 5 (or so) works.
> It occurs to me that increasing the size by 50% is better done by
> int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
> I agree with the plan of setting the capacity to something near
> MAX_VALUE on overflow, and throw OutOfMemoryError on next resize.
> These bugs are not known.
> Chris Hegarty, could you file a bug for us?
> Martin
> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern kevin.l.stern at gmail.com>
> wrote:
> > Greetings,
> >
> > I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
> > java.io.ByteArrayOutputStream which arise when the capacities of the
> data
> > structures reach a particular threshold. More below.
> >
> > When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE its
> size
> > reaches its capacity and an add or an insert operation is invoked, the
> > capacity is increased by only one element. Notice that in the following
> > excerpt from ArrayList.ensureCapacity the new capacity is set to (3/2) *
> > oldCapacity + 1 unless this value would not suffice to accommodate the
> > required capacity in which case it is set to the required capacity. If
> the
> > current capacity is at least (2/3)*Integer.MAX_VALUE, then (oldCapacity
> *
> > 3)/2 + 1 overflows and resolves to a negative number resulting in the
> new
> > capacity being set to the required capacity. The major consequence of
> this
> > is that each subsequent add/insert operation results in a full resize
> of the
> > ArrayList causing performance to degrade significantly.
> >
> > int newCapacity = (oldCapacity * 3)/2 + 1;
> > if (newCapacity
> > newCapacity = minCapacity;
> >
> > Hashtable breaks entirely when the size of its backing array reaches
> (1/2) *
> > Integer.MAX_VALUE and a rehash is necessary as is evident from the
> following
> > excerpt from rehash. Notice that rehash will attempt to create an array
> of
> > negative size if the size of the backing array reaches (1/2) *
> > Integer.MAX_VALUE since oldCapacity * 2 + 1 overflows and resolves to a
> > negative number.
> >
> > int newCapacity = oldCapacity * 2 + 1;
> > HashtableEntry newTable[] = new HashtableEntry[newCapacity];
> >
> > When the capacity of the backing array in a ByteArrayOutputStream
> reaches
> > (1/2) * Integer.MAX_VALUE its size reaches its capacity and a write
> > operation is invoked, the capacity of the backing array is increased
> only by
> > the required number of elements. Notice that in the following excerpt
> from
> > ByteArrayOutputStream.write(int) the new backing array capacity is set
> to 2
> > * buf.length unless this value would not suffice to accommodate the
> required
> > capacity in which case it is set to the required capacity. If the
> current
> > backing array capacity is at least (1/2) * Integer.MAX_VALUE + 1, then
> > buf.length
> > new capacity being set to the required capacity. The major consequence
> of
> > this, like with ArrayList, is that each subsequent write operation
> results
> > in a full resize of the ByteArrayOutputStream causing performance to
> degrade
> > significantly.
> >
> > int newcount = count + 1;
> > if (newcount > buf.length) {
> > buf = Arrays.copyOf(buf, Math.max(buf.length
> > }
> >
> > It is interesting to note that any statements about the amortized time
> > complexity of add/insert operations, such as the one in the ArrayList
> > javadoc, are invalidated by the performance related bugs. One solution
> to
> > the above situations is to set the new capacity of the backing array to
> > Integer.MAX_VALUE when the initial size calculation results in a
> negative
> > number during a resize.
> >
> > Apologies if these bugs are already known.
> >
> > Regards,
> >
> > Kevin
> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100305/833b39e4/attachment.html>
More information about the core-libs-dev
mailing list