A List implementation backed by multiple small arrays rather than the traditional single large array.

Thu Apr 1 02:34:15 UTC 2010

I'm almost convinced now that the paper is incorrect.  The code below gives
me the appropriate index into the index array and the offset into the data
block.  That being said, remember when I mentioned that this will include a
bit more work to access an element than a simple bit shift and a bit mask?
Well this is more than a bit more - we'll be doing this each time an index
is requested.  I'll spend some time trying to twiddle the bits to see if I
can eliminate/combine some of the operations.

        for (int r = 1; r < 33; r++) {
            int k = lg(r);
            int floorKO2 = k >> 1;
            int powFloorKO2 = (1 << floorKO2);
            int p = ((1 << floorKO2) - 1) << 1;
            int ceilKO2;
            if ((k & 1) == 1) {
                ceilKO2 = floorKO2 + 1;
                p += powFloorKO2;
            } else {
                ceilKO2 = floorKO2;
            }
            int e = r & ((1 << ceilKO2) - 1);
            int b = (r >> ceilKO2) & (powFloorKO2 - 1);

            System.out.println((r - 1) + " " + (p + b) + " " + e);
        }

Kevin

On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern <kevin.l.stern at gmail.com>wrote:

> I realize that 2 * (2^(k/2) - 1) only works for even numbered superblocks,
> the odd numbered superblocks need an additional term added (the number of
> data blocks in SB_[k-1]) to jive with my interpretation; anyhow, I also came
> across an alternative characterization of superblock in the paper which
> states that data blocks are grouped within a superblock when they are the
> same size - to me, though, that implies that my example structure below
> would be
>
> SB_0: [1]
> SB_1: [2][2][2]
> SB_2: [4][4][4][4][4][4]
>
> which seems to contradict my understanding of (1) below.  I must be reading
> this upside down.
>
>
> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern <kevin.l.stern at gmail.com>wrote:
>
>> What am I missing here?  In "Resizable arrays in optimal time and space"
>> the authors define their data structure with the following property:
>>
>> (1)  "When superblock SB_k is fully allocated, it consists of
>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))."
>>
>> Since the superblock is zero-based indexed this implies the following
>> structure:
>>
>> SB_0: [1]
>> SB_1: [2]
>> SB_2: [2][2]
>> SB_3: [4][4]
>> SB_4: [4][4][4][4]
>> [...]
>>
>> Let's have a look at Algorithm 3, Locate(i), with i = 3:
>>
>> r = 100 (the binary expansion of i + 1)
>> k = |r| - 1 = 2
>> p = 2^k - 1 = 3
>>
>> What concerns me is their statement that p represents "the number of data
>> blocks in superblocks prior to SB_k."  There are only two data blocks in
>> superblocks prior to SB_2, not three.  Given (1) above, unless I'm
>> misinterpreting it, the number of data blocks in superblocks prior to SB_k
>> should be:
>>
>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1)
>>
>> This, of course, seems to work out much better in my example above, giving
>> the correct answer to my interpretation of their data structure, but I have
>> a hard time believing that this is their mistake rather than my
>> misinterpretation.
>>
>> Thoughts?
>>
>> Kevin
>>
>>
>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz <martinrb at google.com>wrote:
>>
>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern <kevin.l.stern at gmail.com>
>>> wrote:
>>> > Hi Martin,
>>> >
>>> > Thanks much for your feedback.  The first approach that comes to mind
>>> to
>>> > implement O(1) time front as well as rear insertion is to create a
>>> cyclic
>>> > list structure with a front/rear pointer - to insert at the front
>>> requires
>>> > decrementing the front pointer (modulo the size) and to insert at the
>>> rear
>>> > requires incrementing the rear pointer (modulo the size).  We need to
>>> resize
>>> > when the two pointers bump into each other.  Could you explain more
>>> about
>>> > your suggestion of introducing an arraylet that is shared by the front
>>> and
>>> > the rear?
>>>
>>> It was a half-baked idea - I don't know if there's a way to turn it into
>>> something useful.  I was thinking of the ArrayDeque implementation,
>>> where all the elements live in a single array.
>>>
>>> >  It's not clear to me how that would help and/or be a better
>>> > approach than the cyclic list.  Anyhow, the paper that you reference,
>>> > "Resizable arrays in optimal time and space", gives a deque so if we
>>> take
>>> > that approach then the deque is specified.
>>>
>>> Technically, ArrayList also supports the Deque operations -
>>> just not efficiently.
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100331/0e3f0369/attachment.html>