A List implementation backed by multiple small arrays rather than the traditional single large array.

Martin Buchholz martinrb at google.com
Mon Mar 29 07:23:35 UTC 2010


On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> I put together the following class, ChunkedArrayList, in response to
> Martin's request (excerpted from an earlier conversation on this web board)
> below.
>
> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>
> Thoughts?

This class is well on the way to what I was thinking of,
but my bar for acceptance is a little higher.
In particular, I don't want to add yet another class
that is can replace some, but not all of existing
list implementations.

Most obviously, I don't want to lose the ability,
introduced in ArrayDeque, of having O(1) insertion
at the front and end of the collection.
Perhaps you can do this by having one "arraylet"
always be shared by both ends, which
grow towards each other in circular fashion.

I also think we should shrink the array when
necessary, so that occupancy never drops
below, say 50%.

Perhaps we should also have amortized O(1)
insertion in the middle by using a "gap array".
Probably more important for byte/char collections
like StringBuilder...

I believe there are more complicated implementations
that permit O(1) insertions at the ends, and only
O(sqrt(N)) space overhead.

....

E.g. Use your favorite search engine to do
some research on:
Resizable arrays in optimal time and space
Succinct dynamic data structures

Meta-comment: there is not enough transfer of
academic research results into practice; I would think this
is one of the responsibilities of the researchers.

I presume you'd be willing to sign a
contributor agreement to get your changes into
the JDK someday.

Martin

> Regards,
>
> Kevin
>
>
> On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
> wrote:
>
>     It surely is not a good idea to use a single backing array
>     for huge arrays.  As you point out, it's up to 32GB
>     for just one object.  But the core JDK
>     doesn't offer a suitable alternative for users who need very
>     large collections.
>
>     It would have been more in the spirit of Java to have a
>     collection class instead of ArrayList that was not fastest at
>     any particular operation, but had excellent asymptotic behaviour,
>     based on backing arrays containing backing arrays.
>     But:
>     - no such excellent class has been written yet
>      (or please point me to such a class)
>     - even if it were, such a best-of-breed-general-purpose
>      List implementation would probably need to be introduced as a
>      separate class, because of the performance expectations of
>      existing implementations.
>
>     In the meantime, we have to maintain what we got,
>     and that includes living with arrays and classes that wrap them.
>
>     Changing the spec is unlikely to succeed..
>
>     Martin
>



More information about the core-libs-dev mailing list