Bugs in java.util.ArrayList, java.util.Hashtable and java.io.ByteArrayOutputStream

Martin Buchholz martinrb at google.com
Tue Mar 9 20:04:06 UTC 2010


On Tue, Mar 9, 2010 at 03:02, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> I did a quick search and it appears that Java is indeed two's complement
> based.  Nonetheless, please allow me to point out that, in general, this
> type of code worries me since I fully expect that at some point someone will
> come along and do exactly what Dmytro suggested; that is, someone will
> change:
>
> if (a - b > 0)
>
> to
>
> if (a > b)
>
> and the entire ship will sink.  I, personally, like to avoid obscurities
> such as making integer overflow an essential basis for my algorithm unless
> there is a good reason to do so.  I would, in general, prefer to avoid
> overflow altogether and to make the overflow scenario more explicit:
>
> if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) {
>    // do something
> } else {
>   // do something else
> }

It's a good point.

In ArrayList we cannot do this (or at least not compatibly)
because ensureCapacity is a public API and effectively already
accepts negative numbers as requests for a positive capacity
that cannot be satisfied.

The current API is used like this:

        int newcount = count + len;
        ensureCapacity(newcount);

If you want to avoid overflow, you would need to change
to something less natural like

        ensureCapacity(count, len);
        int newcount = count + len;

Anyways, I'm keeping the overflow-conscious code,
but adding more warning comments,
and "out-lining" huge array creation so that
ArrayList's code now looks like:


    /**
     * Increases the capacity of this <tt>ArrayList</tt> instance, if
     * necessary, to ensure that it can hold at least the number of elements
     * specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    public void ensureCapacity(int minCapacity) {
        modCount++;
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

    /**
     * The maximum size of array to allocate.
     * Some VMs reserve some header words in an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    private int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }

Webrev regenerated.

Martin

> Of course, these are simply my coding preferences and I may very well be
> missing the 'good reason' to take the current approach.
>
> Regards,
>
> Kevin
>
> On Tue, Mar 9, 2010 at 4:38 AM, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
>>
>> These comparisons are essential to the working of Martin's algorithm.  I
>> found them interesting as well, but notice that when the capacity overflows
>> these comparisons will always be false.  That is to say:
>>
>> oldCapacity < minCapacity (given, otherwise we would not be resizing)
>> therefore oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity -
>> minCapacity < oldCapacity
>>
>> So if oldCapacity + (0.5 for ArrayList, else 1) * oldCapacity >
>> Integer.MAX_VALUE, subtracting minCapacity re-overflows back into the
>> positive number realm.
>>
>> That being said, and this is a question/comment to all, I want to point
>> out that this type of code assumes a particular class of orderly overflow
>> behavior.  Is this specified in the Java spec, or will this break on an
>> obscure machine that does not use, say, two's complement arithmetic?
>>
>> Regards,
>>
>> Kevin
>>
>> 2010/3/9 Dmytro Sheyko <dmytro_sheyko at hotmail.com>
>>>
>>> Is there any reason to use comparison like this
>>>
>>> if (newCapacity - minCapacity < 0)
>>>
>>> if (newCapacity - MAX_ARRAY_SIZE > 0) {
>>>
>>> instead of
>>>
>>> if (newCapacity < minCapacity)
>>>
>>> if (newCapacity > MAX_ARRAY_SIZE) {
>>>
>>> Thanks,
>>> Dmytro
>>>
>>> > Date: Mon, 8 Mar 2010 18:10:37 -0800
>>> > Subject: Re: Bugs in java.util.ArrayList, java.util.Hashtable and
>>> > java.io.ByteArrayOutputStream
>>> > From: martinrb at google.com
>>> > To: kevin.l.stern at gmail.com; christopher.hegarty at sun.com;
>>> > alan.bateman at sun.com
>>> > CC: core-libs-dev at openjdk.java.net
>>> >
>>> > [Chris or Alan, please review and file a bug]
>>> >
>>> > OK, guys,
>>> >
>>> > Here's a patch:
>>> >
>>> > http://cr.openjdk.java.net/~martin/webrevs/openjdk7/ArrayResize/
>>> >
>>> > Martin
>>>
>>>
>>> ________________________________
>>> Hotmail: Trusted email with powerful SPAM protection. Sign up now.
>
>



More information about the core-libs-dev mailing list