Bugs in java.util.ArrayList, java.util.Hashtable and java.io.ByteArrayOutputStream
Martin Buchholz
martinrb at google.com
Tue Mar 9 20:15:41 UTC 2010
It surely is not a good idea to use a single backing array
for a huge collection. As you point out, that's up to 32GB
for just one object. But the core JDK
doesn't offer a suitable alternative for users who need very
large collections.
It would have been more in the spirit of Java to have,
instead of ArrayList, a collection class that was not the fastest
at any particular operation but had excellent asymptotic behaviour,
based on backing arrays containing backing arrays.
But:
- no such excellent class has been written yet
(or please point me to such a class)
- even if it were, such a best-of-breed-general-purpose
List implementation would probably need to be introduced as a
separate class, because of the performance expectations of
existing implementations.
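For illustration only, here is a rough sketch of the "backing arrays
containing backing arrays" idea - a hypothetical class, not anything that
exists in the JDK; a real implementation would need the full List contract,
fail-fast iterators, and careful tuning:

    import java.util.AbstractList;
    import java.util.ArrayList;

    // Illustrative sketch: elements live in fixed-size chunks, so no single
    // backing array ever needs to approach the VM's array-length limit.
    public class ChunkedList<E> extends AbstractList<E> {
        private static final int CHUNK_SIZE = 1 << 16;   // 65536 elements per chunk
        private final ArrayList<Object[]> chunks = new ArrayList<Object[]>();
        private int size = 0;

        public boolean add(E e) {
            if (size % CHUNK_SIZE == 0)                   // last chunk full, or list empty
                chunks.add(new Object[CHUNK_SIZE]);
            chunks.get(size / CHUNK_SIZE)[size % CHUNK_SIZE] = e;
            size++;
            return true;
        }

        @SuppressWarnings("unchecked")
        public E get(int index) {
            if (index < 0 || index >= size)
                throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
            return (E) chunks.get(index / CHUNK_SIZE)[index % CHUNK_SIZE];
        }

        public int size() {
            return size;
        }
    }

Note that the int-based List API still caps such a class at Integer.MAX_VALUE
elements; going meaningfully beyond that would need long indices, i.e. a new
interface, which is another reason it would have to be a separate class.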
In the meantime, we have to maintain what we've got,
and that includes living with arrays and classes that wrap them.
Changing the spec is unlikely to succeed.
Martin
On Tue, Mar 9, 2010 at 12:02, Osvaldo Doederlein <opinali at gmail.com> wrote:
> Should we really consider this a VM bug? I'm not sure that it's a good idea
> to allocate a single object whose size exceeds 4Gb (for a byte[] - due to
> the object header and array size field) - even on a 64-bit VM. An array with
> 2^32 elements is impossible; the maximum allowed by the size field is 2^32-1,
> which will be just as bad as 2^32-N for any other tiny positive N for
> algorithms that love arrays of [base-2] "round" sizes.
>
> And then if this bug is fixed, it may have slightly different variations.
> For a long[] or double[] array, the allocation for the maximum size would
> exceed 32Gb, so it exceeds the maximum heap size for 64-bit HotSpot with
> CompressedOops. (OK, this is an artificial issue because we won't likely have a
> 100% free heap, so the only impediment for "new long[2^32-1]" would be the
> array header.)
>
> My suggestion: impose some fixed N (maybe 64, or 0x100, ...), limiting
> arrays to 2^32-N (for ANY element type). The artificial restriction should
> be large enough to fit the object header of any vendor's JVM, plus the
> per-object overhead of any reasonable heap structure. This limit could be
> added to the spec, so the implementation is not a bug anymore :) and it
> would be a portable limit. Otherwise, some app may work reliably on HotSpot
> if it relies on the fact that 2^32-5 positions are possible, but may break
> on some other vendor's JVM where perhaps the implementation limit is 2^32-13
> or something else.
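To make the portability concern concrete, a throwaway probe along these
lines (a hypothetical test program, not any JDK API) can report the largest
array length a given VM will actually allocate; it needs to be run with a
heap of several GB (e.g. -Xmx6g) so that it finds the VM's array-length
limit rather than ordinary heap exhaustion:

    // Hypothetical probe: binary-search for the largest byte[] the running
    // VM will allocate.  The answer may differ between VMs and versions.
    public class MaxArrayLengthProbe {
        public static void main(String[] args) {
            int lo = 0, hi = Integer.MAX_VALUE;          // invariant: lo is allocatable
            while (lo < hi) {
                // round up, using long arithmetic to avoid int overflow
                int mid = (int) (((long) lo + hi + 1) / 2);
                if (canAllocate(mid))
                    lo = mid;                            // mid worked; try larger
                else
                    hi = mid - 1;                        // mid failed; try smaller
            }
            System.out.println("largest allocatable byte[]: " + lo
                    + " = Integer.MAX_VALUE - " + (Integer.MAX_VALUE - lo));
        }

        private static boolean canAllocate(int length) {
            try {
                return new byte[length].length == length;
            } catch (OutOfMemoryError e) {   // e.g. "Requested array size exceeds VM limit"
                return false;
            }
        }
    }

Whatever one VM reports, another vendor's VM may report something slightly
different, which is exactly the gap a spec-level headroom constant would close.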
>
> A+
> Osvaldo
>
> 2010/3/9 Martin Buchholz <martinrb at google.com>
>>
>> On Tue, Mar 9, 2010 at 03:59, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
>> > In PriorityQueue:
>> >
>> > let newCapacity end up as 0xFFFFFFFC = -4
>> > then "if (newCapacity - MAX_ARRAY_SIZE > 0)" ---> false
>> > then Arrays.copyOf(queue, newCapacity) --->
>> > ArrayIndexOutOfBoundsException
>>
>> How could newCapacity ever become -4?
>> Growth is only by 50%. But even 100% looks safe...
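To spell out the arithmetic (assuming, for concreteness, MAX_ARRAY_SIZE =
Integer.MAX_VALUE - 8; the exact headroom does not change the outcome),
even the worst-case 50% growth overflow is caught, because the
overflow-conscious subtraction wraps back above zero:

    // Worked example of the overflow-conscious check.
    public class GrowOverflowDemo {
        static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;   // assumed value

        public static void main(String[] args) {
            int oldCapacity = Integer.MAX_VALUE;                 // worst case
            int newCapacity = oldCapacity + (oldCapacity >> 1);  // 50% growth overflows...
            System.out.println(newCapacity);                     // ...to -1073741826, nowhere near -4
            // With 50% growth every overflowed newCapacity lies roughly in
            // [-2^31, -2^30], and for all such values the subtraction below
            // wraps back above zero, so the check fires and the capacity is
            // clamped instead of being passed to Arrays.copyOf(queue, <negative>):
            System.out.println(newCapacity - MAX_ARRAY_SIZE > 0); // prints true
        }
    }

So a small negative value like -4 cannot be reached by 50% growth from any
legal array length; it would require an already-corrupted capacity.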
>>
>> > Am I wrong?
>> >
>> > 2.) Why don't you prefer a system-wide constant for MAX_ARRAY_SIZE ???
>>
>> This should never become a public API - it's a bug in the VM.
>>
>> I prefer the duplication of code to creating a new external dependency.
>>
>> Martin
>>
>> > -Ulf
>
>