A List implementation backed by multiple small arrays rather than the traditional single large array.
Martin Buchholz
martinrb at google.com
Fri Apr 2 03:28:18 UTC 2010
Hi Kevin,
You're probably the only one on this list who has
seriously read the paper. It is not surprising that
taking a research paper into production uncovers
bugs - the research code never had to undergo
rigorous testing. (I like the Java culture of
combining spec + implementation + test suite.)
I suggest you ask the authors directly about the bug.
They would probably also be interested to hear
about your implementation.
Are you aware of Integer.numberOfLeadingZeros?
http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)
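For reference, the floor-log2 helper lg(r) used in the code further down can be expressed in terms of that method; a minimal sketch (an assumed helper, not code from the thread):

```java
// Minimal sketch: floor(log2(r)) for r > 0, written in terms of
// Integer.numberOfLeadingZeros, which the JIT compiles to a single
// instruction on most platforms.
public class Lg {
    static int lg(int r) {
        return 31 - Integer.numberOfLeadingZeros(r);
    }

    public static void main(String[] args) {
        System.out.println(lg(1));   // 0
        System.out.println(lg(4));   // 2
        System.out.println(lg(100)); // 6
    }
}
```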
Martin
On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> I'm almost convinced now that the paper is incorrect. The code below gives
> me the appropriate index into the index array and the offset into the data
> block. That being said, remember when I mentioned that this will include a
> bit more work to access an element than a simple bit shift and a bit mask?
> Well this is more than a bit more - we'll be doing this each time an index
> is requested. I'll spend some time trying to twiddle the bits to see if I
> can eliminate/combine some of the operations.
>
> for (int r = 1; r < 33; r++) {
>     int k = lg(r);                              // k = floor(log2(r)), superblock index
>     int floorKO2 = k >> 1;                      // floor(k/2)
>     int powFloorKO2 = 1 << floorKO2;            // 2^floor(k/2)
>     int p = (powFloorKO2 - 1) << 1;             // data blocks before SB_k, even-k case
>     int ceilKO2;
>     if ((k & 1) == 1) {
>         ceilKO2 = floorKO2 + 1;                 // ceil(k/2) for odd k
>         p += powFloorKO2;                       // add SB_[k-1]'s data blocks
>     } else {
>         ceilKO2 = floorKO2;
>     }
>     int e = r & ((1 << ceilKO2) - 1);           // element offset within the data block
>     int b = (r >> ceilKO2) & (powFloorKO2 - 1); // data block index within SB_k
>
>     System.out.println((r - 1) + " " + (p + b) + " " + e);
> }
>
> Kevin
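A self-contained version of the loop above, with an assumed lg helper (floor-log2 via Integer.numberOfLeadingZeros) and the computation factored into a method, makes the mapping easy to check against the block layout:

```java
// Sketch of the corrected Locate computation from the message above.
// The lg helper is an assumption: floor(log2(r)) for r > 0.
public class Locate {
    static int lg(int r) {
        return 31 - Integer.numberOfLeadingZeros(r);
    }

    // Returns {dataBlock, offset} for the zero-based element index i.
    static int[] locate(int i) {
        int r = i + 1;
        int k = lg(r);                       // superblock index
        int floorKO2 = k >> 1;               // floor(k/2)
        int powFloorKO2 = 1 << floorKO2;     // 2^floor(k/2)
        int p = (powFloorKO2 - 1) << 1;      // data blocks before SB_k, even-k case
        int ceilKO2 = floorKO2;
        if ((k & 1) == 1) {
            ceilKO2++;                       // ceil(k/2) for odd k
            p += powFloorKO2;                // add SB_[k-1]'s data blocks
        }
        int e = r & ((1 << ceilKO2) - 1);            // offset within the data block
        int b = (r >> ceilKO2) & (powFloorKO2 - 1);  // data block within SB_k
        return new int[] { p + b, e };
    }

    public static void main(String[] args) {
        for (int i = 0; i < 32; i++) {
            int[] loc = locate(i);
            System.out.println(i + " " + loc[0] + " " + loc[1]);
        }
    }
}
```

For instance, against the layout SB_0: [1], SB_1: [2], SB_2: [2][2], SB_3: [4][4], ... element 7 should land at block 4, offset 0 - and locate(7) does return that.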
>
> On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
>>
>> I realize that 2 * (2^(k/2) - 1) only works for even-numbered superblocks;
>> the odd-numbered superblocks need an additional term (the number of data
>> blocks in SB_[k-1]) to jibe with my interpretation. Anyhow, I also came
>> across an alternative characterization of superblocks in the paper, which
>> states that data blocks are grouped within a superblock when they are the
>> same size - to me, though, that implies that my example structure below
>> would be
>>
>> SB_0: [1]
>> SB_1: [2][2][2]
>> SB_2: [4][4][4][4][4][4]
>>
>> which seems to contradict my understanding of (1) below. I must be
>> reading this upside down.
>>
>> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern <kevin.l.stern at gmail.com>
>> wrote:
>>>
>>> What am I missing here? In "Resizable arrays in optimal time and space"
>>> the authors define their data structure with the following property:
>>>
>>> (1) "When superblock SB_k is fully allocated, it consists of
>>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))."
>>>
>>> Since the superblock is zero-based indexed this implies the following
>>> structure:
>>>
>>> SB_0: [1]
>>> SB_1: [2]
>>> SB_2: [2][2]
>>> SB_3: [4][4]
>>> SB_4: [4][4][4][4]
>>> [...]
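Under that zero-based reading of (1), the layout can be generated mechanically; a minimal sketch that reproduces the listing above:

```java
// Prints the superblock layout implied by property (1) with a
// zero-based superblock index k: SB_k has 2^floor(k/2) data blocks,
// each of size 2^ceil(k/2).
public class Superblocks {
    public static void main(String[] args) {
        for (int k = 0; k <= 4; k++) {
            int blocks = 1 << (k / 2);       // 2^floor(k/2) data blocks
            int size = 1 << ((k + 1) / 2);   // each of size 2^ceil(k/2)
            StringBuilder line = new StringBuilder("SB_" + k + ": ");
            for (int b = 0; b < blocks; b++) {
                line.append("[").append(size).append("]");
            }
            System.out.println(line);        // e.g. "SB_2: [2][2]"
        }
    }
}
```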
>>>
>>> Let's have a look at Algorithm 3, Locate(i), with i = 3:
>>>
>>> r = 100 (the binary expansion of i + 1)
>>> k = |r| - 1 = 2
>>> p = 2^k - 1 = 3
>>>
>>> What concerns me is their statement that p represents "the number of data
>>> blocks in superblocks prior to SB_k." There are only two data blocks in
>>> superblocks prior to SB_2, not three. Given (1) above, unless I'm
>>> misinterpreting it, the number of data blocks in superblocks prior to SB_k
>>> should be:
>>>
>>> 2 * Sum[i=0..(k/2)-1] 2^i = 2 * (2^(k/2) - 1)
>>>
>>> This, of course, seems to work out much better in my example above,
>>> giving the correct answer to my interpretation of their data structure, but
>>> I have a hard time believing that this is their mistake rather than my
>>> misinterpretation.
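One way to make the discrepancy concrete is to count the data blocks directly from property (1); a small sketch (not code from the thread) comparing that count against the paper's p = 2^k - 1:

```java
// Counts the data blocks in superblocks SB_0 .. SB_[k-1] directly
// from property (1): SB_j has 2^floor(j/2) data blocks. Compares the
// result with the paper's claimed p = 2^k - 1.
public class BlockCount {
    static int blocksBefore(int k) {
        int count = 0;
        for (int j = 0; j < k; j++) {
            count += 1 << (j / 2);   // 2^floor(j/2) blocks in SB_j
        }
        return count;
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 6; k++) {
            int actual = blocksBefore(k);
            int paper = (1 << k) - 1;   // the paper's p = 2^k - 1
            System.out.println("k=" + k + " actual=" + actual + " paper=" + paper);
        }
    }
}
```

For k = 2 this gives actual = 2 versus the paper's 3, matching the objection above; for even k the direct count agrees with 2 * (2^(k/2) - 1).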
>>>
>>> Thoughts?
>>>
>>> Kevin
>>>
>>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz <martinrb at google.com>
>>> wrote:
>>>>
>>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern <kevin.l.stern at gmail.com>
>>>> wrote:
>>>> > Hi Martin,
>>>> >
>>>> > Thanks much for your feedback. The first approach that comes to mind to
>>>> > implement O(1) time front as well as rear insertion is to create a cyclic
>>>> > list structure with a front/rear pointer - to insert at the front requires
>>>> > decrementing the front pointer (modulo the size) and to insert at the rear
>>>> > requires incrementing the rear pointer (modulo the size). We need to resize
>>>> > when the two pointers bump into each other. Could you explain more about
>>>> > your suggestion of introducing an arraylet that is shared by the front and
>>>> > the rear?
>>>>
>>>> It was a half-baked idea - I don't know if there's a way to turn it into
>>>> something useful. I was thinking of the ArrayDeque implementation,
>>>> where all the elements live in a single array.
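The cyclic front/rear pointer scheme described above is essentially what ArrayDeque does; a minimal sketch (not ArrayDeque's actual code) using a power-of-two array length so the modulo reduces to a bit mask:

```java
// Sketch of a circular buffer with front/rear pointers, in the style
// of java.util.ArrayDeque: head is the first element, tail the next
// free slot, and both wrap modulo the power-of-two array length.
public class Cyclic {
    Object[] elements = new Object[8];   // length kept a power of two
    int head, tail;

    void addFirst(Object e) {
        head = (head - 1) & (elements.length - 1);   // decrement modulo length
        elements[head] = e;
        if (head == tail) grow();                    // pointers met: resize
    }

    void addLast(Object e) {
        elements[tail] = e;
        tail = (tail + 1) & (elements.length - 1);   // increment modulo length
        if (head == tail) grow();
    }

    void grow() {
        int n = elements.length;
        Object[] bigger = new Object[n << 1];
        for (int i = 0; i < n; i++) {
            bigger[i] = elements[(head + i) & (n - 1)];  // unwrap into new array
        }
        elements = bigger;
        head = 0;
        tail = n;
    }

    public static void main(String[] args) {
        Cyclic d = new Cyclic();
        for (int i = 0; i < 10; i++) d.addLast(i);   // forces one resize
        d.addFirst(-1);
        System.out.println(d.elements[d.head]);      // prints -1
    }
}
```

The drawback relative to the multi-array scheme in the paper is that each resize still copies every element into one big array.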
>>>>
>>>> > It's not clear to me how that would help and/or be a better
>>>> > approach than the cyclic list. Anyhow, the paper that you reference,
>>>> > "Resizable arrays in optimal time and space", gives a deque so if we take
>>>> > that approach then the deque is specified.
>>>>
>>>> Technically, ArrayList also supports the Deque operations -
>>>> just not efficiently.
More information about the core-libs-dev
mailing list