A List implementation backed by multiple small arrays rather than the traditional single large array.
Kevin L. Stern
kevin.l.stern at gmail.com
Tue Mar 30 11:25:41 UTC 2010
Hi Martin,
Thanks much for your feedback. The first approach that comes to mind to
implement O(1) time front as well as rear insertion is to create a cyclic
list structure with a front/rear pointer - to insert at the front requires
decrementing the front pointer (modulo the size) and to insert at the rear
requires incrementing the rear pointer (modulo the size). We need to resize
when the two pointers bump into each other. Could you explain more about
your suggestion of introducing an arraylet that is shared by the front and
the rear? It's not clear to me how that would help and/or be a better
approach than the cyclic list. Anyhow, the paper that you reference,
"Resizable arrays in optimal time and space", gives a deque so if we take
that approach then the deque is specified.
Shrinking the array is not a problem - this comes 'for free' (in the sense
that it's required) in the optimal space data structure that you reference.
Regarding the gap array suggestion, it is not clear to me how we will still
compute the correct arraylet/offset for an index in O(1) time if we have
arraylets of arbitrary size. Even worse, if we go with the optimal space
data structure we will not have the option of creating arraylets of
arbitrary size or with arbitrary gaps between elements.
You are absolutely right about the n^(1/2) space overhead; I was not aware
of this research. I'll go ahead and implement the structure defined in
"Resizable arrays in optimal time and space" (once I find some time to do
so).
Regards,
Kevin
On Mon, Mar 29, 2010 at 2:23 AM, Martin Buchholz <martinrb at google.com>wrote:
> On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
> > I put together the following class, ChunkedArrayList, in response to
> > Martin's request (excerpted from an earlier conversation on this web
> board)
> > below.
> >
> >
> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
> >
> > Thoughts?
>
> This class is well on the way to what I was thinking of,
> but my bar for acceptance is a little higher.
> In particular, I don't want to add yet another class
> that is can replace some, but not all of existing
> list implementations.
>
> Most obviously, I don't want to lose the ability,
> introduced in ArrayDeque, of having O(1) insertion
> at the front and end of the collection.
> Perhaps you can do this by having one "arraylet"
> always be shared by both ends, which
> grow towards each other in circular fashion.
>
> I also think we should shrink the array when
> necessary, so that occupancy never drops
> below, say 50%.
>
> Perhaps we should also have amortized O(1)
> insertion in the middle by using a "gap array".
> Probably more important for byte/char collections
> like StringBuilder...
>
> I believe there are more complicated implementations
> that permit O(1) insertions at the ends, and only
> O(sqrt(N)) space overhead.
>
> ....
>
> E.g. Use your favorite search engine to do
> some research on:
> Resizable arrays in optimal time and space
> Succinct dynamic data structures
>
> Meta-comment: there is not enough transfer of
> academic research results into practice; I would think this
> is one of the responsibilities of the researchers.
>
> I presume you'd be willing to sign a
> contributor agreement to get your changes into
> the JDK someday.
>
> Martin
>
> > Regards,
> >
> > Kevin
> >
> >
> > On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
> > wrote:
> >
> > It surely is not a good idea to use a single backing array
> > for huge arrays. As you point out, it's up to 32GB
> > for just one object. But the core JDK
> > doesn't offer a suitable alternative for users who need very
> > large collections.
> >
> > It would have been more in the spirit of Java to have a
> > collection class instead of ArrayList that was not fastest at
> > any particular operation, but had excellent asymptotic behaviour,
> > based on backing arrays containing backing arrays.
> > But:
> > - no such excellent class has been written yet
> > (or please point me to such a class)
> > - even if it were, such a best-of-breed-general-purpose
> > List implementation would probably need to be introduced as a
> > separate class, because of the performance expectations of
> > existing implementations.
> >
> > In the meantime, we have to maintain what we got,
> > and that includes living with arrays and classes that wrap them.
> >
> > Changing the spec is unlikely to succeed..
> >
> > Martin
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100330/84bfa952/attachment.html>
More information about the core-libs-dev
mailing list