A List implementation backed by multiple small arrays rather than the traditional single large array.

Sun Apr 11 16:32:05 UTC 2010

Hello Martin,

I spent some time this weekend trying to bring out bugs in the
implementation; I believe the latest version to be in decent shape.  I have
also gathered some data on the performance of ChunkedArrayList over
ArrayList using the latest 1.6 JDK, which I've included below (note that the
numbers represent the time spent performing the specified operation with
ChunkedArrayList over the time spent with ArrayList, so 1.00 indicates
equivalent performance, < 1.00 indicates that ChunkedArrayList is less
costly and > 1.00 indicates that ArrayList is less costly).  I've noticed
relatively significant variability in a few of the numbers when I switch
hardware; though, these data do seem to represent rough performance
expectations.  For my test I generated x elements and then timed the process
of adding them to ArrayList/ChunkedArrayList, then I performed a get
operation on each for indices 0 through x-1 and finally I used the iterator
mechanism to retrieve the first through xth element (of course, I performed
each of these operations multiple times throwing away the timing for the
first few iterations to warm up the JVM).

Regarding the question of whether or not this belongs in java.util, I would
suggest that if it is desirable from a GC point of view to eliminate the
large backing array from ArrayList then your suggestion of achieving this by
way of a data structure that is both time and space optimal is a
particularly elegant solution as it not only guarantees that no backing
array will be larger than sqrt(n) elements but it also provides dynamic
shrinking behavior, has less maximum memory overhead than ArrayList, and
copies (asymptotically) fewer elements during a resize than ArrayList.  Of
course, this data structure does not do everything better than ArrayList; in
particular, indexed access is more costly, due to the required decomposition
of the index into backing array index and offset and the additional memory
indirection, and insertion-at-an-index is more costly, due to the multiple
array copies necessary to complete the shift.  That being said, I think that
the additional cost of indexed access is partially mitigated by the
availability of iterator and listIterator, whose implementations do not use
the index decomposition procedure, and the additional cost of
insertion-at-an-index is partially mitigated by the fact that
insertion-at-an-index is already an undesirable operation on ArrayList due
to its linear time complexity.

Kevin

1000000 elements:
Client JVM:
Add to ChunkedArrayList over ArrayList: 1.30
Indexed access ChunkedArrayList over ArrayList: 1.80
Iterator ChunkedArrayList over ArrayList: 0.52

Server JVM:
Add to ChunkedArrayList over ArrayList: 0.81
Indexed access ChunkedArrayList over ArrayList: 2.87
Iterator ChunkedArrayList over ArrayList: 1.31

100000 elements:
Client JVM:
Add to ChunkedArrayList over ArrayList: 0.96
Indexed access ChunkedArrayList over ArrayList: 1.86
Iterator ChunkedArrayList over ArrayList: 0.48

Server JVM:
Add to ChunkedArrayList over ArrayList: 0.96
Indexed access ChunkedArrayList over ArrayList: 1.89
Iterator ChunkedArrayList over ArrayList: 2.68

10000 elements:
Client JVM:
Add to ChunkedArrayList over ArrayList: 1.04
Indexed access ChunkedArrayList over ArrayList: 2.33
Iterator ChunkedArrayList over ArrayList: 0.53

Server JVM:
Add to ChunkedArrayList over ArrayList: 0.97
Indexed access ChunkedArrayList over ArrayList: 2.45
Iterator ChunkedArrayList over ArrayList: 2.52

1000 elements:
Client JVM:
Add to ChunkedArrayList over ArrayList: 0.99
Indexed access ChunkedArrayList over ArrayList: 2.27
Iterator ChunkedArrayList over ArrayList: 0.54

Server JVM:
Add to ChunkedArrayList over ArrayList: 0.84
Indexed access ChunkedArrayList over ArrayList: 1.23
Iterator ChunkedArrayList over ArrayList: 1.11

On Fri, Apr 9, 2010 at 7:42 PM, Martin Buchholz <martinrb at google.com> wrote:

> My feeling on whether to support O(1) at both ends
> is that any flavor of this that ends up in the JDK eventually
> should really do this.  My idea is that we can
> wholeheartedly recommend this collection class
> for overall good behavior without any of the surprising
> performance traps of existing collection classes.
>
> But for the preliminary version, it makes sense to
> support only O(1) at one end, if it simplifies the
> implementation.  Random access will of course
> be worse than ArrayList, but by how much?
> We can do some benchmarking and look for
> micro-optimizations now.
>
> Kevin, what is you own personal feeling?
> Is the algorithm correct, and efficient enough?
> Do you think your new collection belongs in java.util?
>
> Martin
>
> On Sun, Apr 4, 2010 at 04:12, Kevin L. Stern <kevin.l.stern at gmail.com>
> wrote:
> > The data structure is available at the second link that I originally
> > provided (once again, it is
> >
> https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en
> ).
> > This does not have O(1) time insertion at the front as yet as it was
> unclear
> > to me whether or not it was agreed upon:
> > _________________
> > From: Osvaldo Doederlein <opinali at gmail.com>
> > Date: Mon, Mar 29, 2010 at 10:08 AM
> > Subject: Re: A List implementation backed by multiple small arrays rather
> > than the traditional single large array.
> > To: Martin Buchholz <martinrb at google.com>
> > Cc: "Kevin L. Stern" <kevin.l.stern at gmail.com>,
> > core-libs-dev at openjdk.java.net
> >
> > Initially, it would be good enough to replace only java.util.ArrayList
> with
> > minimal overhead. ArrayList does not support efficient add-at-front or
> other
> > enhancements of ArrayDeque; but ArrayList is still a much more important
> and
> > popular collection, it's the primary "straight replacement for primitive
> > arrrays" and I guess it should continue with that role.
> > _________________
> >
> > As a disclaimer, I'm still tinkering with this so I'll be updating the
> > document at the provided link as I find improvements.
> >
> > Thoughts?
> >
> > Thanks,
> >
> > Kevin
> >
> > On Thu, Apr 1, 2010 at 10:28 PM, Martin Buchholz <martinrb at google.com>
> > wrote:
> >>
> >> Hi Kevin,
> >>
> >> You're probably the only one on this list who has
> >> seriously read the paper.  It is not surprising that
> >> taking a research paper into production would
> >> discover bugs - the research never had to undergo
> >> rigorous testing.  (I like the Java culture of
> >> combining spec + implementation + test suite)
> >>
> >> I suggest you ask the authors directly about the bug.
> >> They would probably also be interested to hear
> >> about your implementation.
> >>
> >> Are you aware of Integer.numberOfLeadingZeros?
> >>
> >>
> http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)<http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros%28int%29>
> >>
> >> Martin
> >>
> >> On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern <kevin.l.stern at gmail.com>
> >> wrote:
> >> > I'm almost convinced now that the paper is incorrect.  The code below
> >> > gives
> >> > me the appropriate index into the index array and the offset into the
> >> > data
> >> > block.  That being said, remember when I mentioned that this will
> >> > include a
> >> > bit more work to access an element than a simple bit shift and a bit
> >> > mask?
> >> > Well this is more than a bit more - we'll be doing this each time an
> >> > index
> >> > is requested.  I'll spend some time trying to twiddle the bits to see
> if
> >> > I
> >> > can eliminate/combine some of the operations.
> >> >
> >> >         for (int r = 1; r < 33; r++) {
> >> >             int k = lg(r);
> >> >             int floorKO2 = k >> 1;
> >> >             int powFloorKO2 = (1 << floorKO2);
> >> >             int p = ((1 << floorKO2) - 1) << 1;
> >> >             int ceilKO2;
> >> >             if ((k & 1) == 1) {
> >> >                 ceilKO2 = floorKO2 + 1;
> >> >                 p += powFloorKO2;
> >> >             } else {
> >> >                 ceilKO2 = floorKO2;
> >> >             }
> >> >             int e = r & ((1 << ceilKO2) - 1);
> >> >             int b = (r >> ceilKO2) & (powFloorKO2 - 1);
> >> >
> >> >             System.out.println((r - 1) + " " + (p + b) + " " + e);
> >> >         }
> >> >
> >> > Kevin
> >> >
> >> > On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern
> >> > <kevin.l.stern at gmail.com>
> >> > wrote:
> >> >>
> >> >> I realize that 2 * (2^(k/2) - 1) only works for even numbered
> >> >> superblocks,
> >> >> the odd numbered superblocks need an additional term added (the
> number
> >> >> of
> >> >> data blocks in SB_[k-1]) to jive with my interpretation; anyhow, I
> also
> >> >> came
> >> >> across an alternative characterization of superblock in the paper
> which
> >> >> states that data blocks are grouped within a superblock when they are
> >> >> the
> >> >> same size - to me, though, that implies that my example structure
> below
> >> >> would be
> >> >>
> >> >> SB_0: [1]
> >> >> SB_1: [2][2][2]
> >> >> SB_2: [4][4][4][4][4][4]
> >> >>
> >> >> which seems to contradict my understanding of (1) below.  I must be
> >> >> reading this upside down.
> >> >>
> >> >> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern
> >> >> <kevin.l.stern at gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> What am I missing here?  In "Resizable arrays in optimal time and
> >> >>> space"
> >> >>> the authors define their data structure with the following property:
> >> >>>
> >> >>> (1)  "When superblock SB_k is fully allocated, it consists of
> >> >>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))."
> >> >>>
> >> >>> Since the superblock is zero-based indexed this implies the
> following
> >> >>> structure:
> >> >>>
> >> >>> SB_0: [1]
> >> >>> SB_1: [2]
> >> >>> SB_2: [2][2]
> >> >>> SB_3: [4][4]
> >> >>> SB_4: [4][4][4][4]
> >> >>> [...]
> >> >>>
> >> >>> Let's have a look at Algorithm 3, Locate(i), with i = 3:
> >> >>>
> >> >>> r = 100 (the binary expansion of i + 1)
> >> >>> k = |r| - 1 = 2
> >> >>> p = 2^k - 1 = 3
> >> >>>
> >> >>> What concerns me is their statement that p represents "the number of
> >> >>> data
> >> >>> blocks in superblocks prior to SB_k."  There are only two data
> blocks
> >> >>> in
> >> >>> superblocks prior to SB_2, not three.  Given (1) above, unless I'm
> >> >>> misinterpreting it, the number of data blocks in superblocks prior
> to
> >> >>> SB_k
> >> >>> should be:
> >> >>>
> >> >>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1)
> >> >>>
> >> >>> This, of course, seems to work out much better in my example above,
> >> >>> giving the correct answer to my interpretation of their data
> >> >>> structure, but
> >> >>> I have a hard time believing that this is their mistake rather than
> my
> >> >>> misinterpretation.
> >> >>>
> >> >>> Thoughts?
> >> >>>
> >> >>> Kevin
> >> >>>
> >> >>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz <
> martinrb at google.com>
> >> >>> wrote:
> >> >>>>
> >> >>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern
> >> >>>> <kevin.l.stern at gmail.com>
> >> >>>> wrote:
> >> >>>> > Hi Martin,
> >> >>>> >
> >> >>>> > Thanks much for your feedback.  The first approach that comes to
> >> >>>> > mind
> >> >>>> > to
> >> >>>> > implement O(1) time front as well as rear insertion is to create
> a
> >> >>>> > cyclic
> >> >>>> > list structure with a front/rear pointer - to insert at the front
> >> >>>> > requires
> >> >>>> > decrementing the front pointer (modulo the size) and to insert at
> >> >>>> > the
> >> >>>> > rear
> >> >>>> > requires incrementing the rear pointer (modulo the size).  We
> need
> >> >>>> > to
> >> >>>> > resize
> >> >>>> > when the two pointers bump into each other.  Could you explain
> more
> >> >>>> > about
> >> >>>> > your suggestion of introducing an arraylet that is shared by the
> >> >>>> > front
> >> >>>> > and
> >> >>>> > the rear?
> >> >>>>
> >> >>>> It was a half-baked idea - I don't know if there's a way to turn it
> >> >>>> into
> >> >>>> something useful.  I was thinking of the ArrayDeque implementation,
> >> >>>> where all the elements live in a single array.
> >> >>>>
> >> >>>> >  It's not clear to me how that would help and/or be a better
> >> >>>> > approach than the cyclic list.  Anyhow, the paper that you
> >> >>>> > reference,
> >> >>>> > "Resizable arrays in optimal time and space", gives a deque so if
> >> >>>> > we
> >> >>>> > take
> >> >>>> > that approach then the deque is specified.
> >> >>>>
> >> >>>> Technically, ArrayList also supports the Deque operations -
> >> >>>> just not efficiently.
> >> >>>
> >> >>
> >> >
> >> >
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100411/1f750f66/attachment.html>