A List implementation backed by multiple small arrays rather than the traditional single large array.

Kevin L. Stern kevin.l.stern at gmail.com
Tue Mar 30 11:46:17 UTC 2010


Just to state the obvious, though, operations will be somewhat slower with
the optimal space structure.  Retrieval, for instance, requires more than
simply a shift and a bit mask (although not too much more).

On Tue, Mar 30, 2010 at 6:25 AM, Kevin L. Stern <kevin.l.stern at gmail.com>wrote:

> Hi Martin,
>
> Thanks much for your feedback.  The first approach that comes to mind to
> implement O(1) time front as well as rear insertion is to create a cyclic
> list structure with a front/rear pointer - to insert at the front requires
> decrementing the front pointer (modulo the size) and to insert at the rear
> requires incrementing the rear pointer (modulo the size).  We need to resize
> when the two pointers bump into each other.  Could you explain more about
> your suggestion of introducing an arraylet that is shared by the front and
> the rear?  It's not clear to me how that would help and/or be a better
> approach than the cyclic list.  Anyhow, the paper that you reference,
> "Resizable arrays in optimal time and space", gives a deque so if we take
> that approach then the deque is specified.
>
> Shrinking the array is not a problem - this comes 'for free' (in the sense
> that it's required) in the optimal space data structure that you reference.
>
> Regarding the gap array suggestion, it is not clear to me how we will still
> compute the correct arraylet/offset for an index in O(1) time if we have
> arraylets of arbitrary size.  Even worse, if we go with the optimal space
> data structure we will not have the option of creating arraylets of
> arbitrary size or with arbitrary gaps between elements.
>
> You are absolutely right about the n^(1/2) space overhead; I was not aware
> of this research.  I'll go ahead and implement the structure defined in
> "Resizable arrays in optimal time and space" (once I find some time to do
> so).
>
> Regards,
>
> Kevin
>
>
> On Mon, Mar 29, 2010 at 2:23 AM, Martin Buchholz <martinrb at google.com>wrote:
>
>> On Sun, Mar 28, 2010 at 04:55, Kevin L. Stern <kevin.l.stern at gmail.com>
>> wrote:
>> > I put together the following class, ChunkedArrayList, in response to
>> > Martin's request (excerpted from an earlier conversation on this web
>> board)
>> > below.
>> >
>> >
>> https://docs.google.com/leaf?id=0B6brz3MPBDdhMGNiNGIwMTQtMTgxMi00ODlmLTk4ZGYtOWY2NDE0M2E5M2Zl&sort=name&layout=list&num=50
>> >
>> > Thoughts?
>>
>> This class is well on the way to what I was thinking of,
>> but my bar for acceptance is a little higher.
>> In particular, I don't want to add yet another class
>> that is can replace some, but not all of existing
>> list implementations.
>>
>> Most obviously, I don't want to lose the ability,
>> introduced in ArrayDeque, of having O(1) insertion
>> at the front and end of the collection.
>> Perhaps you can do this by having one "arraylet"
>> always be shared by both ends, which
>> grow towards each other in circular fashion.
>>
>> I also think we should shrink the array when
>> necessary, so that occupancy never drops
>> below, say 50%.
>>
>> Perhaps we should also have amortized O(1)
>> insertion in the middle by using a "gap array".
>> Probably more important for byte/char collections
>> like StringBuilder...
>>
>> I believe there are more complicated implementations
>> that permit O(1) insertions at the ends, and only
>> O(sqrt(N)) space overhead.
>>
>> ....
>>
>> E.g. Use your favorite search engine to do
>> some research on:
>> Resizable arrays in optimal time and space
>> Succinct dynamic data structures
>>
>> Meta-comment: there is not enough transfer of
>> academic research results into practice; I would think this
>> is one of the responsibilities of the researchers.
>>
>> I presume you'd be willing to sign a
>> contributor agreement to get your changes into
>> the JDK someday.
>>
>> Martin
>>
>> > Regards,
>> >
>> > Kevin
>> >
>> >
>> > On Tue, Mar 9, 2010 at 3:15 PM, Martin Buchholz <martinrb at google.com>
>> > wrote:
>> >
>> >     It surely is not a good idea to use a single backing array
>> >     for huge arrays.  As you point out, it's up to 32GB
>> >     for just one object.  But the core JDK
>> >     doesn't offer a suitable alternative for users who need very
>> >     large collections.
>> >
>> >     It would have been more in the spirit of Java to have a
>> >     collection class instead of ArrayList that was not fastest at
>> >     any particular operation, but had excellent asymptotic behaviour,
>> >     based on backing arrays containing backing arrays.
>> >     But:
>> >     - no such excellent class has been written yet
>> >      (or please point me to such a class)
>> >     - even if it were, such a best-of-breed-general-purpose
>> >      List implementation would probably need to be introduced as a
>> >      separate class, because of the performance expectations of
>> >      existing implementations.
>> >
>> >     In the meantime, we have to maintain what we got,
>> >     and that includes living with arrays and classes that wrap them.
>> >
>> >     Changing the spec is unlikely to succeed..
>> >
>> >     Martin
>> >
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100330/44d3b34d/attachment.html>


More information about the core-libs-dev mailing list