A List implementation backed by multiple small arrays rather than the traditional single large array.

Kevin L. Stern kevin.l.stern at gmail.com
Tue Apr 13 11:27:15 UTC 2010


Hi Martin,

I had intended to address your request for absolute O(1) operations in the
previous email.  The approach suggested in [Brodnik99resizablearrays] for
achieving this is tantamount to making ArrayList operations absolute O(1)
by keeping around an array of size (3/2)*n and filling it with a constant
number of entries from the main array each time add is called.  Although
this distributes the work done during a resize across the n operations
required to enter a resize-required state, it comes at the expense of
additional memory usage and slower add operations.  My thought is that
this would be a fine approach for a real-time application that requires
hard guarantees on performance, but would be a liability in the many Java
applications that do not require such guarantees.  I look forward to
hearing your thoughts on the matter, though.
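
To make the trade-off concrete, here is a minimal sketch of the
incremental-copy idea (my own simplified, add-only toy version, not the
paper's exact scheme):

import java.util.AbstractList;

public class IncrementalList<E> extends AbstractList<E> {
    private Object[] old;  // array being drained; null when no resize is in flight
    private Object[] cur = new Object[8];
    private int size, migrated;

    @Override public boolean add(E e) {
        if (old == null && size == cur.length) {
            old = cur;                               // begin an incremental resize
            cur = new Object[cur.length * 3 / 2 + 1];
            migrated = 0;
        }
        cur[size++] = e;
        if (old != null) {
            // Copy a constant number of entries per add; two per call is
            // enough to finish before the larger array fills up again.
            for (int i = 0; i < 2 && migrated < old.length; i++) {
                cur[migrated] = old[migrated];
                migrated++;
            }
            if (migrated == old.length) old = null;  // resize complete
        }
        return true;
    }

    @SuppressWarnings("unchecked")
    @Override public E get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
        // During a resize, entries not yet migrated still live in the old array.
        return (E) (old != null && index >= migrated && index < old.length
                ? old[index] : cur[index]);
    }

    @Override public int size() { return size; }
}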

Kevin

On Tue, Apr 13, 2010 at 6:18 AM, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:

> Hi Martin,
>
> It's interesting to note that the old circular list trick will not
> suffice to turn this data structure into a deque, since we might be
> copying all n elements back to the front = 0 position every n^(1/2)
> operations (add wouldn't amortize to O(1)).  We could use the old two
> stacks trick (push elements onto one stack, and flip the bottom half of
> the elements to the 'other' stack when the 'other' stack becomes
> empty), mentioned in [Brodnik99resizablearrays], but I find this to be
> a bit CS 101.  In [Brodnik99resizablearrays] the authors suggest a
> method for making all blocks roughly the same size, allowing us to
> expand/shrink capacity at the beginning or the end; this is the
> approach that I will take to create a deque.
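>
> (For reference, the two-stacks idea is roughly the following toy sketch
> - names are mine, and this is not the approach I intend to take:)
>
> import java.util.ArrayDeque;
> import java.util.Deque;
>
> class TwoStackDeque<E> {
>     private final Deque<E> front = new ArrayDeque<E>(); // top = first element
>     private final Deque<E> back = new ArrayDeque<E>();  // top = last element
>
>     void addFirst(E e) { front.push(e); }
>     void addLast(E e) { back.push(e); }
>
>     E removeFirst() {
>         if (front.isEmpty()) rebalance(back, front);
>         return front.pop();
>     }
>
>     E removeLast() {
>         if (back.isEmpty()) rebalance(front, back);
>         return back.pop();
>     }
>
>     // Flip the bottom half of src onto dst, keeping the top half in src;
>     // this is the amortization step described above.
>     private static <E> void rebalance(Deque<E> src, Deque<E> dst) {
>         Deque<E> tmp = new ArrayDeque<E>();
>         int keep = src.size() / 2;                   // top half stays put
>         for (int i = 0; i < keep; i++) tmp.push(src.pop());
>         while (!src.isEmpty()) dst.push(src.pop());  // bottom half flips over
>         while (!tmp.isEmpty()) src.push(tmp.pop());
>     }
> }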
>
> The FAQ for the Sun Contributor Agreement Q3 (
> http://www.sun.com/software/opensource/contributor_agreement.jsp#sa_3)
> indicates that one should check with the project to determine where the SCA
> should be sent.  Do you know where I would find this information?
>
> Kevin
>
> @MISC{Brodnik99resizablearrays,
>     author = {Andrej Brodnik and Svante Carlsson and Erik D. Demaine and
>               J. Ian Munro and Robert Sedgewick},
>     title = {Resizable Arrays in Optimal Time and Space},
>     year = {1999}
> }
>
> On Sun, Apr 11, 2010 at 4:17 PM, Martin Buchholz <martinrb at google.com> wrote:
>
>> Hi Kevin,
>>
>> Thanks for your continuing work on this.
>>
>> I like the test results, and agree with your analysis.
>> I'm especially happy that you're beating
>> ArrayList at some operations.
>>
>> I'd like to see O(1) addition at the beginning,
>> and to have it implement both List and Deque (I regret
>> our not having done this with ArrayDeque).
>>
>> An additional property that would be nice to
>> have (but don't try too hard)
>> is to provide some kind of real-time
>> guarantees on the cost of an individual operation,
>> not just amortized time.  E.g. ArrayList.add
>> is worst-case O(n), making it unsuitable for use
>> in some real-time applications.
>>
>> I will help get your changes into the obvious
>> software distributions.  I assume you're happy
>> with having this class included in any of
>> Doug Lea's jsr166, guava-libraries, or the JDK itself.
>> You should sign a Sun contributor agreement,
>> or whatever the Oracle equivalent is,
>> if you have not done so yet.
>>
>> Doug Lea likes public domain,
>> guava-libraries likes the Apache license.
>>
>> We should give various people a chance to give
>> a thumbs up on the design of this class -
>> Doug Lea, Josh Bloch.
>>
>> Martin
>>
>> On Sun, Apr 11, 2010 at 09:32, Kevin L. Stern <kevin.l.stern at gmail.com>
>> wrote:
>> > Hello Martin,
>> >
>> > I spent some time this weekend trying to bring out bugs in the
>> > implementation; I believe the latest version to be in decent shape.  I
>> > have also gathered some data on the performance of ChunkedArrayList
>> > over ArrayList using the latest 1.6 JDK, which I've included below
>> > (note that the numbers represent the time spent performing the
>> > specified operation with ChunkedArrayList over the time spent with
>> > ArrayList, so 1.00 indicates equivalent performance, < 1.00 indicates
>> > that ChunkedArrayList is less costly and > 1.00 indicates that
>> > ArrayList is less costly).  I've noticed relatively significant
>> > variability in a few of the numbers when I switch hardware, though
>> > these data do seem to represent rough performance expectations.  For
>> > my test I generated x elements and then timed the process of adding
>> > them to ArrayList/ChunkedArrayList; then I performed a get operation
>> > on each for indices 0 through x-1, and finally I used the iterator
>> > mechanism to retrieve the first through xth elements (of course, I
>> > performed each of these operations multiple times, throwing away the
>> > timing for the first few iterations to warm up the JVM).
>> >
>> > Regarding the question of whether or not this belongs in java.util, I
>> > would suggest that if it is desirable from a GC point of view to
>> > eliminate the large backing array from ArrayList, then your suggestion
>> > of achieving this by way of a data structure that is both time and
>> > space optimal is a particularly elegant solution: it not only
>> > guarantees that no backing array will be larger than sqrt(n) elements,
>> > but it also provides dynamic shrinking behavior, has less maximum
>> > memory overhead than ArrayList, and copies (asymptotically) fewer
>> > elements during a resize than ArrayList.  Of course, this data
>> > structure does not do everything better than ArrayList; in particular,
>> > indexed access is more costly, due to the required decomposition of the
>> > index into backing array index and offset and the additional memory
>> > indirection, and insertion-at-an-index is more costly, due to the
>> > multiple array copies necessary to complete the shift.  That being
>> > said, I think that the additional cost of indexed access is partially
>> > mitigated by the availability of iterator and listIterator, whose
>> > implementations do not use the index decomposition procedure, and the
>> > additional cost of insertion-at-an-index is partially mitigated by the
>> > fact that insertion-at-an-index is already an undesirable operation on
>> > ArrayList due to its linear time complexity.
>> >
>> > Kevin
>> >
>> > 1000000 elements:
>> > Client JVM:
>> > Add to ChunkedArrayList over ArrayList: 1.30
>> > Indexed access ChunkedArrayList over ArrayList: 1.80
>> > Iterator ChunkedArrayList over ArrayList: 0.52
>> >
>> > Server JVM:
>> > Add to ChunkedArrayList over ArrayList: 0.81
>> > Indexed access ChunkedArrayList over ArrayList: 2.87
>> > Iterator ChunkedArrayList over ArrayList: 1.31
>> >
>> > 100000 elements:
>> > Client JVM:
>> > Add to ChunkedArrayList over ArrayList: 0.96
>> > Indexed access ChunkedArrayList over ArrayList: 1.86
>> > Iterator ChunkedArrayList over ArrayList: 0.48
>> >
>> > Server JVM:
>> > Add to ChunkedArrayList over ArrayList: 0.96
>> > Indexed access ChunkedArrayList over ArrayList: 1.89
>> > Iterator ChunkedArrayList over ArrayList: 2.68
>> >
>> > 10000 elements:
>> > Client JVM:
>> > Add to ChunkedArrayList over ArrayList: 1.04
>> > Indexed access ChunkedArrayList over ArrayList: 2.33
>> > Iterator ChunkedArrayList over ArrayList: 0.53
>> >
>> > Server JVM:
>> > Add to ChunkedArrayList over ArrayList: 0.97
>> > Indexed access ChunkedArrayList over ArrayList: 2.45
>> > Iterator ChunkedArrayList over ArrayList: 2.52
>> >
>> > 1000 elements:
>> > Client JVM:
>> > Add to ChunkedArrayList over ArrayList: 0.99
>> > Indexed access ChunkedArrayList over ArrayList: 2.27
>> > Iterator ChunkedArrayList over ArrayList: 0.54
>> >
>> > Server JVM:
>> > Add to ChunkedArrayList over ArrayList: 0.84
>> > Indexed access ChunkedArrayList over ArrayList: 1.23
>> > Iterator ChunkedArrayList over ArrayList: 1.11
>> >
>> >
>> > On Fri, Apr 9, 2010 at 7:42 PM, Martin Buchholz <martinrb at google.com>
>> > wrote:
>> >>
>> >> My feeling on whether to support O(1) at both ends
>> >> is that any flavor of this that ends up in the JDK eventually
>> >> should really do this.  My idea is that we can
>> >> wholeheartedly recommend this collection class
>> >> for overall good behavior without any of the surprising
>> >> performance traps of existing collection classes.
>> >>
>> >> But for the preliminary version, it makes sense to
>> >> support only O(1) at one end, if it simplifies the
>> >> implementation.  Random access will of course
>> >> be worse than ArrayList, but by how much?
>> >> We can do some benchmarking and look for
>> >> micro-optimizations now.
>> >>
>> >> Kevin, what is your own personal feeling?
>> >> Is the algorithm correct, and efficient enough?
>> >> Do you think your new collection belongs in java.util?
>> >>
>> >> Martin
>> >>
>> >> On Sun, Apr 4, 2010 at 04:12, Kevin L. Stern <kevin.l.stern at gmail.com>
>> >> wrote:
>> >> > The data structure is available at the second link that I originally
>> >> > provided (once again, it is
>> >> > https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en).
>> >> > This does not yet have O(1) time insertion at the front, as it was
>> >> > unclear to me whether or not it was agreed upon:
>> >> > _________________
>> >> > From: Osvaldo Doederlein <opinali at gmail.com>
>> >> > Date: Mon, Mar 29, 2010 at 10:08 AM
>> >> > Subject: Re: A List implementation backed by multiple small arrays
>> >> > rather
>> >> > than the traditional single large array.
>> >> > To: Martin Buchholz <martinrb at google.com>
>> >> > Cc: "Kevin L. Stern" <kevin.l.stern at gmail.com>,
>> >> > core-libs-dev at openjdk.java.net
>> >> >
>> >> > Initially, it would be good enough to replace only
>> >> > java.util.ArrayList with minimal overhead.  ArrayList does not
>> >> > support efficient add-at-front or other enhancements of ArrayDeque;
>> >> > but ArrayList is still a much more important and popular collection,
>> >> > it's the primary "straight replacement for primitive arrays" and I
>> >> > guess it should continue with that role.
>> >> > _________________
>> >> >
>> >> > As a disclaimer, I'm still tinkering with this, so I'll be updating
>> >> > the document at the provided link as I find improvements.
>> >> >
>> >> > Thoughts?
>> >> >
>> >> > Thanks,
>> >> >
>> >> > Kevin
>> >> >
>> >> > On Thu, Apr 1, 2010 at 10:28 PM, Martin Buchholz
>> >> > <martinrb at google.com> wrote:
>> >> >>
>> >> >> Hi Kevin,
>> >> >>
>> >> >> You're probably the only one on this list who has
>> >> >> seriously read the paper.  It is not surprising that
>> >> >> taking a research paper into production would
>> >> >> discover bugs - the research never had to undergo
>> >> >> rigorous testing.  (I like the Java culture of
>> >> >> combining spec + implementation + test suite)
>> >> >>
>> >> >> I suggest you ask the authors directly about the bug.
>> >> >> They would probably also be interested to hear
>> >> >> about your implementation.
>> >> >>
>> >> >> Are you aware of Integer.numberOfLeadingZeros?
>> >> >>
>> >> >>
>> >> >>
>> >> >> http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)
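>> >> >>
>> >> >> e.g., assuming your lg(r) is floor(log2(r)) for r > 0, it is just:
>> >> >>
>> >> >>     int k = 31 - Integer.numberOfLeadingZeros(r);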
>> >> >>
>> >> >> Martin
>> >> >>
>> >> >> On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern
>> >> >> <kevin.l.stern at gmail.com> wrote:
>> >> >> > I'm almost convinced now that the paper is incorrect.  The code
>> >> >> > below gives me the appropriate index into the index array and the
>> >> >> > offset into the data block.  That being said, remember when I
>> >> >> > mentioned that this will include a bit more work to access an
>> >> >> > element than a simple bit shift and a bit mask?  Well, this is
>> >> >> > more than a bit more - we'll be doing this each time an index is
>> >> >> > requested.  I'll spend some time trying to twiddle the bits to
>> >> >> > see if I can eliminate/combine some of the operations.
>> >> >> >
>> >> >> >         // For each one-based index r, compute the data block
>> >> >> >         // (p + b) and the zero-based offset within it (e);
>> >> >> >         // lg(r) denotes floor(log2(r)).
>> >> >> >         for (int r = 1; r < 33; r++) {
>> >> >> >             int k = lg(r);                      // superblock number
>> >> >> >             int floorKO2 = k >> 1;              // floor(k/2)
>> >> >> >             int powFloorKO2 = (1 << floorKO2);  // 2^floor(k/2)
>> >> >> >             // Data blocks in superblocks prior to SB_k:
>> >> >> >             // 2 * (2^floor(k/2) - 1) for even k ...
>> >> >> >             int p = ((1 << floorKO2) - 1) << 1;
>> >> >> >             int ceilKO2;
>> >> >> >             if ((k & 1) == 1) {
>> >> >> >                 ceilKO2 = floorKO2 + 1;
>> >> >> >                 // ... plus the 2^floor(k/2) blocks of SB_[k-1]
>> >> >> >                 // when k is odd.
>> >> >> >                 p += powFloorKO2;
>> >> >> >             } else {
>> >> >> >                 ceilKO2 = floorKO2;
>> >> >> >             }
>> >> >> >             int e = r & ((1 << ceilKO2) - 1);           // offset in block
>> >> >> >             int b = (r >> ceilKO2) & (powFloorKO2 - 1); // block in SB_k
>> >> >> >
>> >> >> >             System.out.println((r - 1) + " " + (p + b) + " " + e);
>> >> >> >         }
>> >> >> >
>> >> >> > Kevin
>> >> >> >
>> >> >> > On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern
>> >> >> > <kevin.l.stern at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> I realize that 2 * (2^(k/2) - 1) only works for even-numbered
>> >> >> >> superblocks; the odd-numbered superblocks need an additional term
>> >> >> >> added (the number of data blocks in SB_[k-1]) to jibe with my
>> >> >> >> interpretation.  Anyhow, I also came across an alternative
>> >> >> >> characterization of superblock in the paper which states that
>> >> >> >> data blocks are grouped within a superblock when they are the
>> >> >> >> same size - to me, though, that implies that my example structure
>> >> >> >> below would be
>> >> >> >>
>> >> >> >> SB_0: [1]
>> >> >> >> SB_1: [2][2][2]
>> >> >> >> SB_2: [4][4][4][4][4][4]
>> >> >> >>
>> >> >> >> which seems to contradict my understanding of (1) below.  I must
>> >> >> >> be reading this upside down.
>> >> >> >>
>> >> >> >> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern
>> >> >> >> <kevin.l.stern at gmail.com>
>> >> >> >> wrote:
>> >> >> >>>
>> >> >> >>> What am I missing here?  In "Resizable arrays in optimal time
>> >> >> >>> and space" the authors define their data structure with the
>> >> >> >>> following property:
>> >> >> >>>
>> >> >> >>> (1)  "When superblock SB_k is fully allocated, it consists of
>> >> >> >>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))."
>> >> >> >>>
>> >> >> >>> Since the superblock is zero-based indexed this implies the
>> >> >> >>> following
>> >> >> >>> structure:
>> >> >> >>>
>> >> >> >>> SB_0: [1]
>> >> >> >>> SB_1: [2]
>> >> >> >>> SB_2: [2][2]
>> >> >> >>> SB_3: [4][4]
>> >> >> >>> SB_4: [4][4][4][4]
>> >> >> >>> [...]
>> >> >> >>>
>> >> >> >>> Let's have a look at Algorithm 3, Locate(i), with i = 3:
>> >> >> >>>
>> >> >> >>> r = 100 (the binary expansion of i + 1)
>> >> >> >>> k = |r| - 1 = 2
>> >> >> >>> p = 2^k - 1 = 3
>> >> >> >>>
>> >> >> >>> What concerns me is their statement that p represents "the
>> >> >> >>> number of data blocks in superblocks prior to SB_k."  There are
>> >> >> >>> only two data blocks in superblocks prior to SB_2, not three.
>> >> >> >>> Given (1) above, unless I'm misinterpreting it, the number of
>> >> >> >>> data blocks in superblocks prior to SB_k should be:
>> >> >> >>>
>> >> >> >>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1)
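>> >> >> >>>
>> >> >> >>> (A quick brute-force check of this, as throwaway code - counting
>> >> >> >>> the blocks directly, with SB_j contributing 2^(floor(j/2))
>> >> >> >>> blocks, for even k:)
>> >> >> >>>
>> >> >> >>>         for (int k = 2; k <= 10; k += 2) {
>> >> >> >>>             int direct = 0;
>> >> >> >>>             for (int j = 0; j < k; j++)
>> >> >> >>>                 direct += 1 << (j / 2);  // data blocks in SB_j
>> >> >> >>>             int formula = 2 * ((1 << (k / 2)) - 1);
>> >> >> >>>             System.out.println(k + ": " + direct + " == " + formula);
>> >> >> >>>         }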
>> >> >> >>>
>> >> >> >>> This, of course, seems to work out much better in my example
>> >> >> >>> above, giving the correct answer for my interpretation of their
>> >> >> >>> data structure, but I have a hard time believing that this is
>> >> >> >>> their mistake rather than my misinterpretation.
>> >> >> >>>
>> >> >> >>> Thoughts?
>> >> >> >>>
>> >> >> >>> Kevin
>> >> >> >>>
>> >> >> >>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz
>> >> >> >>> <martinrb at google.com>
>> >> >> >>> wrote:
>> >> >> >>>>
>> >> >> >>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern
>> >> >> >>>> <kevin.l.stern at gmail.com>
>> >> >> >>>> wrote:
>> >> >> >>>> > Hi Martin,
>> >> >> >>>> >
>> >> >> >>>> > Thanks much for your feedback.  The first approach that
>> >> >> >>>> > comes to mind to implement O(1) time front as well as rear
>> >> >> >>>> > insertion is to create a cyclic list structure with a
>> >> >> >>>> > front/rear pointer - to insert at the front requires
>> >> >> >>>> > decrementing the front pointer (modulo the size) and to
>> >> >> >>>> > insert at the rear requires incrementing the rear pointer
>> >> >> >>>> > (modulo the size).  We need to resize when the two pointers
>> >> >> >>>> > bump into each other.  Could you explain more about your
>> >> >> >>>> > suggestion of introducing an arraylet that is shared by the
>> >> >> >>>> > front and the rear?
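>> >> >> >>>> >
>> >> >> >>>> > Concretely, the pointer updates I have in mind are just the
>> >> >> >>>> > following sketch (elements/head/size/cap are placeholder
>> >> >> >>>> > fields, and resize() is left out):
>> >> >> >>>> >
>> >> >> >>>> >     void addFirst(Object e) {
>> >> >> >>>> >         head = (head - 1 + cap) % cap;  // decrement front pointer
>> >> >> >>>> >         elements[head] = e;
>> >> >> >>>> >         if (++size == cap) resize();    // pointers bumped: grow
>> >> >> >>>> >     }
>> >> >> >>>> >
>> >> >> >>>> >     void addLast(Object e) {
>> >> >> >>>> >         elements[(head + size) % cap] = e;  // increment rear pointer
>> >> >> >>>> >         if (++size == cap) resize();
>> >> >> >>>> >     }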
>> >> >> >>>>
>> >> >> >>>> It was a half-baked idea - I don't know if there's a way to
>> >> >> >>>> turn it into something useful.  I was thinking of the
>> >> >> >>>> ArrayDeque implementation, where all the elements live in a
>> >> >> >>>> single array.
>> >> >> >>>>
>> >> >> >>>> > It's not clear to me how that would help and/or be a better
>> >> >> >>>> > approach than the cyclic list.  Anyhow, the paper that you
>> >> >> >>>> > reference, "Resizable arrays in optimal time and space",
>> >> >> >>>> > gives a deque, so if we take that approach then the deque is
>> >> >> >>>> > specified.
>> >> >> >>>>
>> >> >> >>>> Technically, ArrayList also supports the Deque operations -
>> >> >> >>>> just not efficiently.
>> >> >> >>>
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>>
>
>