Bugs in java.util.ArrayList, java.util.Hashtable and java.io.ByteArrayOutputStream
Ulf Zibis
Ulf.Zibis at gmx.de
Tue Mar 9 11:59:26 UTC 2010
1.) In PriorityQueue:
suppose newCapacity ends up as 0xFFFFFFFC = -4;
then "if (newCapacity - MAX_ARRAY_SIZE > 0)" ---> false
then Arrays.copyOf(queue, newCapacity) ---> NegativeArraySizeException
Am I wrong?
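
Here is a minimal, self-contained illustration of the arithmetic I mean
(the MAX_ARRAY_SIZE value is only assumed here; the class and constant
exist purely for this demo):

    public class PriorityQueueGrowDemo {
        // assumed value for illustration; the exact constant is in the webrev
        private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

        public static void main(String[] args) {
            int newCapacity = 0xFFFFFFFC;                           // == -4
            System.out.println(newCapacity - MAX_ARRAY_SIZE > 0);   // false, so the guard misses it
            java.util.Arrays.copyOf(new Object[8], newCapacity);    // NegativeArraySizeException
        }
    }
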
2.) Why not use a system-wide constant for MAX_ARRAY_SIZE?
-Ulf
On 09.03.2010 03:10, Martin Buchholz wrote:
> [Chris or Alan, please review and file a bug]
>
> OK, guys,
>
> Here's a patch:
>
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>
> Martin
>
>> On Fri, Mar 5, 2010 at 02:48, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>
>> Hi Martin,
>>
>> Thank you for your reply. If I may, PriorityQueue appears to employ the
>> simple strategy that I suggested above in its grow method:
>>
>>     int newCapacity = ((oldCapacity < 64) ?
>>                        ((oldCapacity + 1) * 2) :
>>                        ((oldCapacity / 2) * 3));
>>     if (newCapacity < 0) // overflow
>>         newCapacity = Integer.MAX_VALUE;
>>
>> It might be desirable to set a common strategy for capacity increase for all
>> collections.
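>>
>> Such a shared strategy might look roughly like this (the helper name
>> and exact policy are just a sketch, not an existing JDK API):
>>
>>     // grow by ~50%, clamping instead of overflowing
>>     static int newCapacity(int oldCapacity, int minCapacity) {
>>         int grown = oldCapacity + (oldCapacity >> 1);
>>         if (grown - minCapacity < 0)   // overflow-conscious comparison
>>             grown = minCapacity;
>>         if (grown < 0)                 // the request itself overflowed int
>>             grown = Integer.MAX_VALUE;
>>         return grown;
>>     }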
>>
>> Regards,
>>
>> Kevin
>>
>> On Fri, Mar 5, 2010 at 3:04 AM, Martin Buchholz <martinrb at google.com> wrote:
>>
>>> Hi Kevin,
>>>
>>> As you've noticed, creating objects within a factor of two of
>>> their natural limits is a good way to expose lurking bugs.
>>>
>>> I'm the one responsible for the algorithm in ArrayList.
>>> I'm a bit embarrassed, looking at that code today.
>>> We could set the array size to Integer.MAX_VALUE,
>>> but then you might hit an independent buglet in hotspot:
>>> you cannot allocate an array with Integer.MAX_VALUE
>>> elements, but Integer.MAX_VALUE - 5 (or so) works.
>>>
>>> It occurs to me that increasing the size by 50% is better done by
>>>     int newCapacity = oldCapacity + (oldCapacity >> 1) + 1;
>>>
>>> I agree with the plan of setting the capacity to something near
>>> MAX_VALUE on overflow, and throwing OutOfMemoryError on the next resize.
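>>>
>>> Concretely, something along these lines (the constant value and helper
>>> name are only a sketch):
>>>
>>>     // some VMs cannot allocate arrays of exactly Integer.MAX_VALUE elements
>>>     private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
>>>
>>>     private static int hugeCapacity(int minCapacity) {
>>>         if (minCapacity < 0)   // the requested size itself overflowed int
>>>             throw new OutOfMemoryError();
>>>         return (minCapacity > MAX_ARRAY_SIZE)
>>>             ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
>>>     }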
>>>
>>> These bugs are not known.
>>> Chris Hegarty, could you file a bug for us?
>>>
>>> Martin
>>>
>>> On Wed, Mar 3, 2010 at 17:41, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>
>>>> Greetings,
>>>>
>>>> I've noticed bugs in java.util.ArrayList, java.util.Hashtable and
>>>> java.io.ByteArrayOutputStream which arise when the capacities of the
>>>> data structures reach a particular threshold.  More below.
>>>>
>>>> When the capacity of an ArrayList reaches (2/3)*Integer.MAX_VALUE, its
>>>> size reaches its capacity, and an add or an insert operation is
>>>> invoked, the capacity is increased by only one element.  Notice that in
>>>> the following excerpt from ArrayList.ensureCapacity the new capacity is
>>>> set to (3/2) * oldCapacity + 1 unless this value would not suffice to
>>>> accommodate the required capacity, in which case it is set to the
>>>> required capacity.  If the current capacity is at least
>>>> (2/3)*Integer.MAX_VALUE, then (oldCapacity * 3)/2 + 1 overflows and
>>>> resolves to a negative number, resulting in the new capacity being set
>>>> to the required capacity.  The major consequence of this is that each
>>>> subsequent add/insert operation results in a full resize of the
>>>> ArrayList, causing performance to degrade significantly.
>>>>
>>>>     int newCapacity = (oldCapacity * 3)/2 + 1;
>>>>     if (newCapacity < minCapacity)
>>>>         newCapacity = minCapacity;
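>>>>
>>>> For example (a standalone illustration of the overflow, not JDK code):
>>>>
>>>>     public class ArrayListGrowthDemo {
>>>>         public static void main(String[] args) {
>>>>             int oldCapacity = (Integer.MAX_VALUE / 3) * 2;  // 1431655764, ~ (2/3) * Integer.MAX_VALUE
>>>>             int newCapacity = (oldCapacity * 3) / 2 + 1;    // oldCapacity * 3 wraps to -4
>>>>             System.out.println(newCapacity);                // prints -1, less than any minCapacity
>>>>         }
>>>>     }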
>>>>
>>>> Hashtable breaks entirely when the size of its backing array reaches
>>>> (1/2) * Integer.MAX_VALUE and a rehash is necessary, as is evident from
>>>> the following excerpt from rehash.  Notice that rehash will attempt to
>>>> create an array of negative size if the size of the backing array
>>>> reaches (1/2) * Integer.MAX_VALUE, since oldCapacity * 2 + 1 overflows
>>>> and resolves to a negative number.
>>>>
>>>>     int newCapacity = oldCapacity * 2 + 1;
>>>>     HashtableEntry newTable[] = new HashtableEntry[newCapacity];
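>>>>
>>>> For example (again a standalone illustration, not the Hashtable source):
>>>>
>>>>     public class HashtableRehashDemo {
>>>>         public static void main(String[] args) {
>>>>             int oldCapacity = Integer.MAX_VALUE / 2 + 1;    // 2^30, first length for which doubling overflows
>>>>             int newCapacity = oldCapacity * 2 + 1;          // wraps to -2147483647
>>>>             System.out.println(newCapacity);
>>>>             Object[] newTable = new Object[newCapacity];    // NegativeArraySizeException
>>>>         }
>>>>     }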
>>>>
>>>> When the capacity of the backing array in a ByteArrayOutputStream
>>>> reaches (1/2) * Integer.MAX_VALUE, its size reaches its capacity, and a
>>>> write operation is invoked, the capacity of the backing array is
>>>> increased only by the required number of elements.  Notice that in the
>>>> following excerpt from ByteArrayOutputStream.write(int) the new backing
>>>> array capacity is set to 2 * buf.length unless this value would not
>>>> suffice to accommodate the required capacity, in which case it is set
>>>> to the required capacity.  If the current backing array capacity is at
>>>> least (1/2) * Integer.MAX_VALUE + 1, then buf.length << 1 overflows and
>>>> resolves to a negative number, resulting in the new capacity being set
>>>> to the required capacity.  The major consequence of this, like with
>>>> ArrayList, is that each subsequent write operation results in a full
>>>> resize of the ByteArrayOutputStream, causing performance to degrade
>>>> significantly.
>>>>
>>>>     int newcount = count + 1;
>>>>     if (newcount > buf.length) {
>>>>         buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
>>>>     }
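>>>>
>>>> For example (a standalone illustration of the doubling going negative):
>>>>
>>>>     public class ByteArrayGrowthDemo {
>>>>         public static void main(String[] args) {
>>>>             int bufLength = Integer.MAX_VALUE / 2 + 1;        // 2^30
>>>>             int newcount = bufLength + 1;
>>>>             int doubled = bufLength << 1;                     // wraps to Integer.MIN_VALUE
>>>>             System.out.println(Math.max(doubled, newcount));  // just newcount: the buffer grows by one element
>>>>         }
>>>>     }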
>>>>
>>>> It is interesting to note that any statements about the amortized time
>>>> complexity of add/insert operations, such as the one in the ArrayList
>>>> javadoc, are invalidated by the performance-related bugs.  One solution
>>>> to the above situations is to set the new capacity of the backing array
>>>> to Integer.MAX_VALUE when the initial size calculation results in a
>>>> negative number during a resize.
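>>>>
>>>> In code, that check could be as small as this (a sketch against the
>>>> ArrayList excerpt above, not an actual patch):
>>>>
>>>>     int newCapacity = (oldCapacity * 3)/2 + 1;
>>>>     if (newCapacity < 0)            // overflow during the resize calculation
>>>>         newCapacity = Integer.MAX_VALUE;
>>>>     if (newCapacity < minCapacity)
>>>>         newCapacity = minCapacity;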
>>>>
>>>> Apologies if these bugs are already known.
>>>>
>>>> Regards,
>>>>
>>>> Kevin
>>>>
>>>>
>>
>>
>
>