A List implementation backed by multiple small arrays rather than the traditional single large array.

Benedict Elliott Smith lists at laerad.com
Thu Apr 15 09:07:03 UTC 2010


Hi Kevin,

Yes, as I was going to bed last night I realised I had not fully addressed
the problem originally under discussion; I had only reduced the constant
factor for addition to the end of the list. A trivial modification fixes
that, however: the same scheme, but only up to some maximum arraylet size
(a power of 2), beyond which capacity grows linearly in arraylets of that
fixed size. Performance doesn't seem to have been affected appreciably,
although I haven't been exhaustive in the benchmarking:

10 items inserts versus ArrayList: Chunked=1.15, ExpArray=1.16
10 items inserts Chunked / ExpArray = 0.99
10 items get versus ArrayList: Chunked=1.15, ExpArray=1.16
10 items get Chunked / ExpArray = 0.99
100 items inserts versus ArrayList: Chunked=1.24, ExpArray=1.01
100 items inserts Chunked / ExpArray = 1.23
100 items get versus ArrayList: Chunked=1.24, ExpArray=1.01
100 items get Chunked / ExpArray = 1.23
1000 items inserts versus ArrayList: Chunked=1.22, ExpArray=1.03
1000 items inserts Chunked / ExpArray = 1.19
1000 items get versus ArrayList: Chunked=1.22, ExpArray=1.03
1000 items get Chunked / ExpArray = 1.19
10000 items inserts versus ArrayList: Chunked=1.22, ExpArray=1.03
10000 items inserts Chunked / ExpArray = 1.18
10000 items get versus ArrayList: Chunked=1.22, ExpArray=1.03
10000 items get Chunked / ExpArray = 1.18
100000 items inserts versus ArrayList: Chunked=0.82, ExpArray=0.75
100000 items inserts Chunked / ExpArray = 1.09
100000 items get versus ArrayList: Chunked=0.82, ExpArray=0.75
100000 items get Chunked / ExpArray = 1.09

The nice thing about this is that the maximum amount of wasted memory is
user configurable. Even with a setting as low as the one above (65K),
performance seems pretty consistent.

The code for calculating the array index and offset is pretty
straightforward; I haven't given much thought to optimisations just yet:

// Offset of list index i within arraylet a: doubling-sized arraylets
// start at index 2^a - 1; fixed-size arraylets follow at regular strides.
private final int indexFor(int a, int i) {
    return 1 + i - (a > maxArrayIndex
            ? (1 + a - maxArrayIndex) << maxArraySizeShift
            : 1 << a);
}

// Arraylet containing list index i: floor(log2(i + 1)) while sizes are
// still doubling, then a linear mapping once the maximum size is reached.
private final int arrayFor(int i) {
    return i >= (maxArraySize << 1)
            ? ((i + 1) >>> maxArraySizeShift) + maxArrayIndex - 1
            : 31 - Integer.numberOfLeadingZeros(i + 1);
}
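
For illustration, element access then just combines the two (an untested
sketch; "arrays" is a made-up name for the field holding the arraylets):

// Untested sketch of how the two helpers combine; bounds checks omitted.
public Object get(int index) {
    int a = arrayFor(index);
    return arrays[a][indexFor(a, index)];
}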

Regarding the double list idea - yes, I agree, I certainly didn't think that
one through fully!



On 15 April 2010 02:44, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:

> Hi Benedict,
>
> Like you, I am relatively new to this mailing list; I am also trying to
> tread lightly so as not to step on any toes.  That being said, I think that
> I can offer a response to your inquiry.
>
> Regarding:  "The idea is to simply double the new array size each time a
> new array needs to be allocated"
>
> It seems this would not address the desire to offer an alternative to
> the allocation of a large backing array for ArrayList (your largest backing
> array could still reach a size of 1/2 * Integer.MAX_VALUE), and it would not
> address the desire to waste the (asymptotically) minimum amount of memory
> in the worst case while maintaining O(1) amortized time bounds.  The data
> structure described in [Brodnik99resizablearrays] has a maximum backing
> array size of O(sqrt(n)) and caps wasted memory at O(sqrt(n)).  What
> advantage over ArrayList do you see in your data structure?
>
> Regarding:  "Also, with regard to a Deque implementation, it seems that the
> simplest solution would be to simply have two lists, with one accepting
> inserts for near the beginning and being ordered in reverse whilst the other
> accepted inserts for near to the end."
>
> What happens with your structure when you add n elements and then remove
> element 0 n times?  I think that once you work out all the kinks you'll end
> up with the two stacks approach, which is mentioned in
> [Brodnik99resizablearrays] and which I mentioned in an earlier email, or
> you'll end up with the circular list approach, which is not friendly to O(1)
> amortized time bounds in a data structure that resizes more often than every
> O(n) operations, due to the 'unshift' to the front = 0 position.  I think
> the best approach
> is the one mentioned in [Brodnik99resizablearrays], which is the approach
> that I am currently working on.  Incidentally, this approach also provides
> for a much improved index unpacking procedure using only bit shifts and bit
> masks, although it is at the expense of (O(1)) additional work during
> resize.
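>
> (For concreteness, a rough and untested sketch of that two stacks idea,
> with made-up names: 'front' holds the first part of the sequence in
> reverse, and when one side runs dry half of the other side is flipped
> across, keeping removal at either end amortized O(1).)
>
> import java.util.ArrayList;
>
> class TwoStackDeque<E> {
>     private final ArrayList<E> front = new ArrayList<E>(); // reversed
>     private final ArrayList<E> back  = new ArrayList<E>();
>
>     public void addFirst(E e) { front.add(e); }
>     public void addLast(E e)  { back.add(e); }
>
>     public E removeFirst() { // no empty-deque checks in this sketch
>         if (front.isEmpty()) flipHalf(back, front);
>         return front.remove(front.size() - 1);
>     }
>
>     public E removeLast() {
>         if (back.isEmpty()) flipHalf(front, back);
>         return back.remove(back.size() - 1);
>     }
>
>     // Move the bottom half of 'from' into 'to', reversing the order.
>     private void flipHalf(ArrayList<E> from, ArrayList<E> to) {
>         int half = (from.size() + 1) / 2;
>         for (int i = half - 1; i >= 0; i--) to.add(from.get(i));
>         from.subList(0, half).clear();
>     }
> }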
>
> Regards,
>
> Kevin
>
>
>
> On Wed, Apr 14, 2010 at 4:42 PM, Benedict Elliott Smith <lists at laerad.com> wrote:
>
>> Hi,
>>
>> I hope you don't consider it rude to involve myself in this conversation
>> towards the end - I joined the mailing list only recently.
>>
>> I'm not sure if this offers a huge amount to the discussion, but I have
>> tinkered with a "chunked" array list which seems to offer better time
>> performance in general at the cost of greater (worst case) memory
>> utilisation. It is easier to understand IMHO as well, although this is not
>> necessarily a great benefit here. It turns out the idea is very similar to
>> the one already implemented by Kevin, though perhaps simpler. The idea is
>> simply to double the new array size each time a new array needs to be
>> allocated - in effect, to allocate an array that is the size of all
>> existing arrays put together. With this scheme the calculations for array
>> and offset are really very straightforward: floor(log2(i + 1)) and
>> 1 + i - 2^floor(log2(i + 1)), respectively. Memory utilisation is the same
>> as for ArrayList, but obviously inserts at the end are much quicker.
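>>
>> (In code, an untested sketch of that decomposition, with made-up helper
>> names, using Integer.numberOfLeadingZeros for floor(log2):)
>>
>> // Arraylet sizes double (1, 2, 4, 8, ...), so list index i lives in
>> // arraylet floor(log2(i + 1)), at offset 1 + i - 2^floor(log2(i + 1)).
>> static int arrayOf(int i)  { return 31 - Integer.numberOfLeadingZeros(i + 1); }
>> static int offsetOf(int i) { return 1 + i - (1 << arrayOf(i)); }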
>>
>> I have prototyped the data structure this evening and benchmarked
>> additions at the end of the list, for which the performance is pretty
>> impressive.
>>
>> Some random statistics for addition only on the client JVM (I have
>> quickly dubbed my implementation ExpArrayList). All statistics were run in
>> two rounds with ~1000 runs per round per statistic per list, and the
>> second-round results were used.
>>
>> 10 items versus ArrayList: Chunked=1.14, ExpArray=1.02
>> 10 items Chunked / ExpArray = 1.12
>> 100 items versus ArrayList: Chunked=1.20, ExpArray=0.82
>> 100 items Chunked / ExpArray = 1.45
>> 1000 items versus ArrayList: Chunked=1.03, ExpArray=0.51
>> 1000 items Chunked / ExpArray = 2.02
>> 10000 items versus ArrayList: Chunked=0.88, ExpArray=0.49
>> 10000 items Chunked / ExpArray = 1.79
>> 100000 items versus ArrayList: Chunked=0.32, ExpArray=0.20
>> 100000 items Chunked / ExpArray = 1.64
>>
>> and server JVM:
>> 10 items versus ArrayList: Chunked=1.00, ExpArray=1.16
>> 10 items Chunked / ExpArray = 0.86
>> 100 items versus ArrayList: Chunked=1.29, ExpArray=0.96
>> 100 items Chunked / ExpArray = 1.34
>> 1000 items versus ArrayList: Chunked=1.16, ExpArray=0.92
>> 1000 items Chunked / ExpArray = 1.27
>> 10000 items versus ArrayList: Chunked=0.93, ExpArray=0.84
>> 10000 items Chunked / ExpArray = 1.12
>> 100000 items versus ArrayList: Chunked=0.71, ExpArray=0.65
>> 100000 items Chunked / ExpArray = 1.10
>>
>> Interestingly, insertion at the beginning of the list appears to be
>> quicker with ExpArrayList than with the chunked list, at least on the
>> server JVM, whereas I would have expected the two to be fairly close.
>> Amazingly, ExpArrayList is faster even than ArrayList for insertion at the
>> beginning of large lists, which I haven't yet tried to understand.
>> Insertion in the middle is similar.
>>
>> 10 items versus ArrayList: Chunked=9.82, ExpArray=3.80
>> 10 items Chunked / ExpArray = 2.59
>> 100 items versus ArrayList: Chunked=7.30, ExpArray=3.41
>> 100 items Chunked / ExpArray = 2.14
>> 1000 items versus ArrayList: Chunked=2.83, ExpArray=1.09
>> 1000 items Chunked / ExpArray = 2.59
>> 10000 items versus ArrayList: Chunked=1.56, ExpArray=0.72
>> 10000 items Chunked / ExpArray = 2.16
>>
>> Finally, there are promising results for get() from the ExpArrayList as
>> well (server JVM), again somehow beating ArrayList for larger lists:
>> 10 items get versus ArrayList: Chunked=1.27, ExpArray=1.16
>> 10 items get Chunked / ExpArray = 1.10
>> 100 items get versus ArrayList: Chunked=1.45, ExpArray=1.17
>> 100 items get Chunked / ExpArray = 1.25
>> 1000 items get versus ArrayList: Chunked=1.42, ExpArray=1.07
>> 1000 items get Chunked / ExpArray = 1.33
>> 10000 items get versus ArrayList: Chunked=1.26, ExpArray=1.02
>> 10000 items get Chunked / ExpArray = 1.24
>> 100000 items get versus ArrayList: Chunked=1.05, ExpArray=0.86
>> 100000 items get Chunked / ExpArray = 1.22
>>
>>
>> I'm willing to explore this further, but I'm not sure how desirable that
>> is, given that Kevin's data structure appears to perform pretty well
>> already wrt CPU time, and better wrt memory utilisation; in effect this
>> mostly changes only the function that determines which array to use, not
>> the body of the implementation. Let me know if you would like a copy of
>> the source code and I will find somewhere to upload it.
>>
>> Also, with regard to a Deque implementation, it seems that the simplest
>> solution would be to have two lists: one accepting inserts near the
>> beginning and ordered in reverse, whilst the other accepts inserts near
>> the end. The only trick would be having the list at the beginning support
>> iteration in reverse order cheaply, but this could easily be achieved by
>> creating an extension of List with a reverseIterator() method, sketched
>> below.
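>>
>> (An untested sketch of that extension; the name is made up:)
>>
>> // A List that can also be walked from the tail cheaply, for use as the
>> // reverse-ordered front list.
>> public interface ReverseIterableList<E> extends java.util.List<E> {
>>     java.util.Iterator<E> reverseIterator();
>> }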
>>
>>
>> Anyway, not sure if this helped at all but fancied joining in...
>>
>>
>>
>>
>> On 14 April 2010 12:25, Joe Kearney <joe.j.kearney at googlemail.com> wrote:
>>
>>> Hi Kevin,
>>>
>>> It implements List, as well as Deque. It is indeed based on ArrayDeque,
>>> with the added operations to implement List. It does so reasonably
>>> efficiently, moving the fewest elements possible on each operation: zero
>>> for the queue operations, at most n/2 for the rest, and all of them for
>>> a backing array resize.
>>>
>>> The idea is to get a replacement for ArrayList that performs like
>>> ArrayDeque on remove(0). As a side effect, we should be able to get better
>>> performance on other operations by requiring fewer elements to be moved,
>>> as in the fragment below.
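>>>
>>> (Illustratively, an untested fragment of that "shift the shorter side"
>>> decision for insert at index i in a circular array; head, tail and
>>> elements are assumed fields here, with capacity a power of two:)
>>>
>>> final int mask = elements.length - 1;
>>> if (i < size / 2) {
>>>     // Shift the first i elements back one slot.
>>>     head = (head - 1) & mask;
>>>     for (int j = 0; j < i; j++)
>>>         elements[(head + j) & mask] = elements[(head + j + 1) & mask];
>>> } else {
>>>     // Shift the last size - i elements forward one slot.
>>>     for (int j = size; j > i; j--)
>>>         elements[(head + j) & mask] = elements[(head + j - 1) & mask];
>>>     tail = (tail + 1) & mask;
>>> }
>>> elements[(head + i) & mask] = e;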
>>>
>>> Thanks,
>>> Joe
>>>
>>> 2010/4/14 Kevin L. Stern <kevin.l.stern at gmail.com>
>>>
>>> Hi Joe,
>>>>
>>>> I was referring to the ChunkedArrayList when I stated that add does not
>>>> amortize to constant time when the data structure employs the circular list
>>>> trick to achieve deque behavior; ChunkedArrayList potentially resizes every
>>>> n^(1/2) operations.
>>>>
>>>> Regarding your CircularArrayList, does it differ from Java's
>>>> ArrayDeque?  I took only a cursory look at it, so please understand if I
>>>> have missed your reason for creating CircularArrayList altogether.
>>>>
>>>> Regards,
>>>>
>>>> Kevin
>>>>
>>>>
>>>> On Tue, Apr 13, 2010 at 6:52 AM, Joe Kearney <
>>>> joe.j.kearney at googlemail.com> wrote:
>>>>
>>>>> Hi Kevin, Martin,
>>>>>
>>>>> To add another discussion point, I've been writing a
>>>>> draft/proof-of-concept of retrofitting the List interface onto ArrayDeque.
>>>>> This works over the raw array, it doesn't use the fancier structures being
>>>>> discussed elsewhere on this list that deal with splitting huge arrays into
>>>>> arraylets, or that provide for O(1) insert in the middle.
>>>>>
>>>>>
>>>>> http://code.google.com/p/libjoe/source/browse/trunk/src/joe/collect/CircularArrayList.java
>>>>>
>>>>> I'd be interested if you have any comments in the context of this
>>>>> discussion. The code is not entirely ready yet (a couple of tests, 6 of
>>>>> 789, fail because of a corner case I haven't nailed down yet), but the
>>>>> idea is there at least. I'd like to add array shrinking later, perhaps
>>>>> when the size dips below capacity*0.4, to avoid flickering up and down
>>>>> around...
>>>>>
>>>>> Tests show performance to be close to ArrayList for the O(1)
>>>>> operations. Timings for indexed reads and writes showed
>>>>> no discernible difference between implementations last time I ran the
>>>>> tests. I don't understand at the moment why the iterator adds at index
>>>>> size/3 and size/2 perform 30% slower than ArrayList on smaller lists, nor
>>>>> the dodgy numbers for ArrayList.insert(5); I'll look at this soon.  Those
>>>>> operations that become O(1) in a circular implementation (and that are
>>>>> implemented and tested here) are faster than in ArrayList. Insert/remove in
>>>>> the middle are somewhat faster than ArrayList because we only have to copy
>>>>> at most half of the elements, except when resizing the array.
>>>>>
>>>>> Kevin, I don't fully understand your point about not amortizing to
>>>>> O(1). Certainly that's true for inserts that are not at the head or tail.
>>>>> Otherwise this implementation only moves array elements to the front on an
>>>>> array resize operation, which happens at most O(log n) times if we do lots
>>>>> of adds, maybe a little more if we add array shrinking too.  This is the
>>>>> same as ArrayList. Are you just referring to the add-in-the-middle case?
>>>>>
>>>>> Some performance results below, code for these is in the repository
>>>>> above too. This was the second run, after a warmup.
>>>>>
>>>>> Thanks,
>>>>> Joe
>>>>>
>>>>> --------------------------------- CircularArrayList ---------------------------------
>>>>>   size   add   get   set   iterAdd/3   iterAdd/2   insert(5)   removeRnd   removeMid   remove(0)
>>>>>     10    20    67    70         125         102          90         240         191         138
>>>>>    100    19    67    70         166         138          94         230         194         118
>>>>>   1000    28    64    67         681         538          91         324         382         119
>>>>>  10000    30    65    67        5884        4425          94        1296        2330         124
>>>>> ------------------------------------- ArrayList -------------------------------------
>>>>>   size   add   get   set   iterAdd/3   iterAdd/2   insert(5)   removeRnd   removeMid   remove(0)
>>>>>     10    23    68    70         100          69       32913         162         130         105
>>>>>    100    20    67    70         129         104       21944         169         134         135
>>>>>   1000    29    63    67         651         506        9602         364         333         526
>>>>>  10000    30    63    66        5878        4414        9947        2312        2280        4437
>>>>>
>>>>> 2010/4/13 Kevin L. Stern <kevin.l.stern at gmail.com>
>>>>>
>>>>> Hi Martin,
>>>>>>
>>>>>> I had intended to address your request for absolute O(1) operations in
>>>>>> the previous email.  The approach to achieving this suggested in
>>>>>> [Brodnik99resizablearrays] is tantamount to making ArrayList operations
>>>>>> absolute O(1) by keeping around an array of size (3/2)*n and filling it with
>>>>>> a constant number of entries from the main array each time add is called.
>>>>>> Although this distributes the work done during a resize across the n
>>>>>> operations required to enter a resize-required state, it is at the expense
>>>>>> of additional memory usage and slower add operations.  My thought is that
>>>>>> this would be a fine approach for a real-time application that requires hard
>>>>>> guarantees on performance but would be a liability in so many Java
>>>>>> applications that do not require these hard guarantees.  I look forward to
>>>>>> hearing your thoughts on the matter, though.
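>>>>>>
>>>>>> (A rough, untested sketch of that incremental-copy scheme with made-up
>>>>>> field names; new elements are written to both arrays and a couple of
>>>>>> old entries migrate on every add, so the full O(n) copy never lands on
>>>>>> a single operation:)
>>>>>>
>>>>>> private Object[] main  = new Object[4];
>>>>>> private Object[] spare = new Object[7]; // ~1.5x the main capacity
>>>>>> private int size, copied;
>>>>>>
>>>>>> public void add(Object e) {
>>>>>>     main[size] = e;
>>>>>>     spare[size] = e;                  // keep the spare in sync
>>>>>>     size++;
>>>>>>     for (int k = 0; k < 2 && copied < size; k++, copied++)
>>>>>>         spare[copied] = main[copied]; // migrate two old entries
>>>>>>     if (size == main.length) {        // spare is fully populated here
>>>>>>         main = spare;
>>>>>>         spare = new Object[main.length * 3 / 2 + 1];
>>>>>>         copied = 0;
>>>>>>     }
>>>>>> }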
>>>>>>
>>>>>> Kevin
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 13, 2010 at 6:18 AM, Kevin L. Stern <
>>>>>> kevin.l.stern at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Martin,
>>>>>>>
>>>>>>> It's interesting to note that the old circular list trick will not
>>>>>>> suffice to turn this data structure into a deque since we might be copying
>>>>>>> all n elements back to the front = 0 position every n^(1/2) operations (add
>>>>>>> wouldn't amortize to O(1)).  We could use the old two stacks trick (push
>>>>>>> elements onto one stack, flip (the bottom) half (of) the elements to the
>>>>>>> 'other' stack when the 'other' stack becomes empty), mentioned in
>>>>>>> [Brodnik99resizablearrays], but I find this to be a bit CS 101.  In
>>>>>>> [Brodnik99resizablearrays] the authors suggest a method for making all
>>>>>>> blocks roughly the same size, allowing us to expand/shrink capacity at the
>>>>>>> beginning or the end; this is the approach that I will take to create a
>>>>>>> deque.
>>>>>>>
>>>>>>> The FAQ for the Sun Contributor Agreement Q3 (
>>>>>>> http://www.sun.com/software/opensource/contributor_agreement.jsp#sa_3)
>>>>>>> indicates that one should check with the project to determine where the SCA
>>>>>>> should be sent.  Do you know where I would find this information?
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>> @MISC{Brodnik99resizablearrays,
>>>>>>>     author = {Andrej Brodnik and Svante Carlsson and Erik D. Demaine and J. Ian Munro and Robert Sedgewick},
>>>>>>>     title = {Resizable Arrays in Optimal Time and Space},
>>>>>>>     year = {1999}
>>>>>>> }
>>>>>>>
>>>>>>> On Sun, Apr 11, 2010 at 4:17 PM, Martin Buchholz <
>>>>>>> martinrb at google.com> wrote:
>>>>>>>
>>>>>>>> Hi Kevin,
>>>>>>>>
>>>>>>>> Thanks for your continuing work on this.
>>>>>>>>
>>>>>>>> I like the test results, and agree with your analysis.
>>>>>>>> I'm especially happy that you're beating
>>>>>>>> ArrayList at some operations.
>>>>>>>>
>>>>>>>> I'd like to see O(1) addition at the beginning, and to
>>>>>>>> see it implement both List and Deque (I regret
>>>>>>>> our not having done this with ArrayDeque).
>>>>>>>>
>>>>>>>> An additional property that would be nice to
>>>>>>>> have (but don't try too hard)
>>>>>>>> is to provide some kind of real-time
>>>>>>>> guarantees on the cost of an individual operation,
>>>>>>>> not just amortized time.  E.g. ArrayList.add
>>>>>>>> is worst-case O(n), making it unsuitable for use
>>>>>>>> in some real-time applications.
>>>>>>>>
>>>>>>>> I will help get your changes into the obvious
>>>>>>>> software distributions.  I assume you're happy
>>>>>>>> with having this class included in any of
>>>>>>>> Doug Lea's jsr166, guava-libraries, or the JDK itself.
>>>>>>>> You should sign a Sun contributor agreement,
>>>>>>>> or whatever the Oracle equivalent is,
>>>>>>>> if you have not done so yet.
>>>>>>>>
>>>>>>>> Doug Lea likes public domain,
>>>>>>>> guava-libraries likes the Apache license.
>>>>>>>>
>>>>>>>> We should give various people a chance to give
>>>>>>>> a thumbs up on the design of this class -
>>>>>>>> Doug Lea, Josh Bloch.
>>>>>>>>
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> On Sun, Apr 11, 2010 at 09:32, Kevin L. Stern <
>>>>>>>> kevin.l.stern at gmail.com> wrote:
>>>>>>>> > Hello Martin,
>>>>>>>> >
>>>>>>>> > I spent some time this weekend trying to bring out bugs in the
>>>>>>>> > implementation; I believe the latest version to be in decent
>>>>>>>> shape.  I have
>>>>>>>> > also gathered some data on the performance of ChunkedArrayList
>>>>>>>> over
>>>>>>>> > ArrayList using the latest 1.6 JDK, which I've included below
>>>>>>>> (note that the
>>>>>>>> > numbers represent the time spent performing the specified
>>>>>>>> operation with
>>>>>>>> > ChunkedArrayList over the time spent with ArrayList, so 1.00
>>>>>>>> indicates
>>>>>>>> > equivalent performance, < 1.00 indicates that ChunkedArrayList is
>>>>>>>> less
>>>>>>>> > costly and > 1.00 indicates that ArrayList is less costly).  I've
>>>>>>>> noticed
>>>>>>>> > relatively significant variability in a few of the numbers when I
>>>>>>>> switch
>>>>>>>> > hardware; though, these data do seem to represent rough
>>>>>>>> performance
>>>>>>>> > expectations.  For my test I generated x elements and then timed
>>>>>>>> the process
>>>>>>>> > of adding them to ArrayList/ChunkedArrayList, then I performed a
>>>>>>>> get
>>>>>>>> > operation on each for indices 0 through x-1 and finally I used the
>>>>>>>> iterator
>>>>>>>> > mechanism to retrieve the first through xth element (of course, I
>>>>>>>> performed
>>>>>>>> > each of these operations multiple times throwing away the timing
>>>>>>>> for the
>>>>>>>> > first few iterations to warm up the JVM).
>>>>>>>> >
>>>>>>>> > Regarding the question of whether or not this belongs in
>>>>>>>> java.util, I would
>>>>>>>> > suggest that if it is desirable from a GC point of view to
>>>>>>>> eliminate the
>>>>>>>> > large backing array from ArrayList then your suggestion of
>>>>>>>> achieving this by
>>>>>>>> > way of a data structure that is both time and space optimal is a
>>>>>>>> > particularly elegant solution as it not only guarantees that no
>>>>>>>> backing
>>>>>>>> > array will be larger than sqrt(n) elements but it also provides
>>>>>>>> dynamic
>>>>>>>> > shrinking behavior, has less maximum memory overhead than
>>>>>>>> ArrayList, and
>>>>>>>> > copies (asymptotically) fewer elements during a resize than
>>>>>>>> ArrayList.  Of
>>>>>>>> > course, this data structure does not do everything better than
>>>>>>>> ArrayList; in
>>>>>>>> > particular, indexed access is more costly, due to the required
>>>>>>>> decomposition
>>>>>>>> > of the index into backing array index and offset and the
>>>>>>>> additional memory
>>>>>>>> > indirection, and insertion-at-an-index is more costly, due to the
>>>>>>>> multiple
>>>>>>>> > array copies necessary to complete the shift.  That being said, I
>>>>>>>> think that
>>>>>>>> > the additional cost of indexed access is partially mitigated by
>>>>>>>> the
>>>>>>>> > availability of iterator and listIterator, whose implementations
>>>>>>>> do not use
>>>>>>>> > the index decomposition procedure, and the additional cost of
>>>>>>>> > insertion-at-an-index is partially mitigated by the fact that
>>>>>>>> > insertion-at-an-index is already an undesirable operation on
>>>>>>>> ArrayList due
>>>>>>>> > to its linear time complexity.
>>>>>>>> >
>>>>>>>> > Kevin
>>>>>>>> >
>>>>>>>> > 1000000 elements:
>>>>>>>> > Client JVM:
>>>>>>>> > Add to ChunkedArrayList over ArrayList: 1.30
>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.80
>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.52
>>>>>>>> >
>>>>>>>> > Server JVM:
>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.81
>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.87
>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 1.31
>>>>>>>> >
>>>>>>>> > 100000 elements:
>>>>>>>> > Client JVM:
>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.96
>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.86
>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.48
>>>>>>>> >
>>>>>>>> > Server JVM:
>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.96
>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.89
>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 2.68
>>>>>>>> >
>>>>>>>> > 10000 elements:
>>>>>>>> > Client JVM:
>>>>>>>> > Add to ChunkedArrayList over ArrayList: 1.04
>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.33
>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.53
>>>>>>>> >
>>>>>>>> > Server JVM:
>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.97
>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.45
>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 2.52
>>>>>>>> >
>>>>>>>> > 1000 elements:
>>>>>>>> > Client JVM:
>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.99
>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.27
>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.54
>>>>>>>> >
>>>>>>>> > Server JVM:
>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.84
>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.23
>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 1.11
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Fri, Apr 9, 2010 at 7:42 PM, Martin Buchholz <
>>>>>>>> martinrb at google.com> wrote:
>>>>>>>> >>
>>>>>>>> >> My feeling on whether to support O(1) at both ends
>>>>>>>> >> is that any flavor of this that ends up in the JDK eventually
>>>>>>>> >> should really do this.  My idea is that we can
>>>>>>>> >> wholeheartedly recommend this collection class
>>>>>>>> >> for overall good behavior without any of the surprising
>>>>>>>> >> performance traps of existing collection classes.
>>>>>>>> >>
>>>>>>>> >> But for the preliminary version, it makes sense to
>>>>>>>> >> support only O(1) at one end, if it simplifies the
>>>>>>>> >> implementation.  Random access will of course
>>>>>>>> >> be worse than ArrayList, but by how much?
>>>>>>>> >> We can do some benchmarking and look for
>>>>>>>> >> micro-optimizations now.
>>>>>>>> >>
>>>>>>>> >> Kevin, what is your own personal feeling?
>>>>>>>> >> Is the algorithm correct, and efficient enough?
>>>>>>>> >> Do you think your new collection belongs in java.util?
>>>>>>>> >>
>>>>>>>> >> Martin
>>>>>>>> >>
>>>>>>>> >> On Sun, Apr 4, 2010 at 04:12, Kevin L. Stern <
>>>>>>>> kevin.l.stern at gmail.com>
>>>>>>>> >> wrote:
>>>>>>>> >> > The data structure is available at the second link that I
>>>>>>>> originally
>>>>>>>> >> > provided (once again, it is
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en
>>>>>>>> ).
>>>>>>>> >> > This does not have O(1) time insertion at the front as yet as
>>>>>>>> it was
>>>>>>>> >> > unclear
>>>>>>>> >> > to me whether or not it was agreed upon:
>>>>>>>> >> > _________________
>>>>>>>> >> > From: Osvaldo Doederlein <opinali at gmail.com>
>>>>>>>> >> > Date: Mon, Mar 29, 2010 at 10:08 AM
>>>>>>>> >> > Subject: Re: A List implementation backed by multiple small
>>>>>>>> arrays
>>>>>>>> >> > rather
>>>>>>>> >> > than the traditional single large array.
>>>>>>>> >> > To: Martin Buchholz <martinrb at google.com>
>>>>>>>> >> > Cc: "Kevin L. Stern" <kevin.l.stern at gmail.com>,
>>>>>>>> >> > core-libs-dev at openjdk.java.net
>>>>>>>> >> >
>>>>>>>> >> > Initially, it would be good enough to replace only
>>>>>>>> java.util.ArrayList
>>>>>>>> >> > with
>>>>>>>> >> > minimal overhead. ArrayList does not support efficient
>>>>>>>> add-at-front or
>>>>>>>> >> > other
>>>>>>>> >> > enhancements of ArrayDeque; but ArrayList is still a much more
>>>>>>>> important
>>>>>>>> >> > and
>>>>>>>> >> > popular collection, it's the primary "straight replacement for
>>>>>>>> primitive
>>>>>>>> >> > arrays" and I guess it should continue with that role.
>>>>>>>> >> > _________________
>>>>>>>> >> >
>>>>>>>> >> > As a disclaimer, I'm still tinkering with this so I'll be
>>>>>>>> updating the
>>>>>>>> >> > document at the provided link as I find improvements.
>>>>>>>> >> >
>>>>>>>> >> > Thoughts?
>>>>>>>> >> >
>>>>>>>> >> > Thanks,
>>>>>>>> >> >
>>>>>>>> >> > Kevin
>>>>>>>> >> >
>>>>>>>> >> > On Thu, Apr 1, 2010 at 10:28 PM, Martin Buchholz <
>>>>>>>> martinrb at google.com>
>>>>>>>> >> > wrote:
>>>>>>>> >> >>
>>>>>>>> >> >> Hi Kevin,
>>>>>>>> >> >>
>>>>>>>> >> >> You're probably the only one on this list who has
>>>>>>>> >> >> seriously read the paper.  It is not surprising that
>>>>>>>> >> >> taking a research paper into production would
>>>>>>>> >> >> discover bugs - the research never had to undergo
>>>>>>>> >> >> rigorous testing.  (I like the Java culture of
>>>>>>>> >> >> combining spec + implementation + test suite)
>>>>>>>> >> >>
>>>>>>>> >> >> I suggest you ask the authors directly about the bug.
>>>>>>>> >> >> They would probably also be interested to hear
>>>>>>>> >> >> about your implementation.
>>>>>>>> >> >>
>>>>>>>> >> >> Are you aware of Integer.numberOfLeadingZeros?
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)
>>>>>>>> >> >>
>>>>>>>> >> >> Martin
>>>>>>>> >> >>
>>>>>>>> >> >> On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern <
>>>>>>>> kevin.l.stern at gmail.com>
>>>>>>>> >> >> wrote:
>>>>>>>> >> >> > I'm almost convinced now that the paper is incorrect.  The
>>>>>>>> code below
>>>>>>>> >> >> > gives
>>>>>>>> >> >> > me the appropriate index into the index array and the offset
>>>>>>>> into the
>>>>>>>> >> >> > data
>>>>>>>> >> >> > block.  That being said, remember when I mentioned that this
>>>>>>>> will
>>>>>>>> >> >> > include a
>>>>>>>> >> >> > bit more work to access an element than a simple bit shift
>>>>>>>> and a bit
>>>>>>>> >> >> > mask?
>>>>>>>> >> >> > Well this is more than a bit more - we'll be doing this each
>>>>>>>> time an
>>>>>>>> >> >> > index
>>>>>>>> >> >> > is requested.  I'll spend some time trying to twiddle the
>>>>>>>> bits to see
>>>>>>>> >> >> > if
>>>>>>>> >> >> > I
>>>>>>>> >> >> > can eliminate/combine some of the operations.
>>>>>>>> >> >> >
>>>>>>>> >> >> >         // lg(r) = floor(log2(r)); prints, for each index
>>>>>>>> >> >> >         // r - 1: the data block and the offset within it.
>>>>>>>> >> >> >         for (int r = 1; r < 33; r++) {
>>>>>>>> >> >> >             int k = lg(r);
>>>>>>>> >> >> >             int floorKO2 = k >> 1;
>>>>>>>> >> >> >             int powFloorKO2 = (1 << floorKO2);
>>>>>>>> >> >> >             int p = ((1 << floorKO2) - 1) << 1;
>>>>>>>> >> >> >             int ceilKO2;
>>>>>>>> >> >> >             if ((k & 1) == 1) {
>>>>>>>> >> >> >                 ceilKO2 = floorKO2 + 1;
>>>>>>>> >> >> >                 p += powFloorKO2;
>>>>>>>> >> >> >             } else {
>>>>>>>> >> >> >                 ceilKO2 = floorKO2;
>>>>>>>> >> >> >             }
>>>>>>>> >> >> >             int e = r & ((1 << ceilKO2) - 1);
>>>>>>>> >> >> >             int b = (r >> ceilKO2) & (powFloorKO2 - 1);
>>>>>>>> >> >> >
>>>>>>>> >> >> >             System.out.println((r - 1) + " " + (p + b) + " " + e);
>>>>>>>> >> >> >         }
>>>>>>>> >> >> >
>>>>>>>> >> >> > Kevin
>>>>>>>> >> >> >
>>>>>>>> >> >> > On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern
>>>>>>>> >> >> > <kevin.l.stern at gmail.com>
>>>>>>>> >> >> > wrote:
>>>>>>>> >> >> >>
>>>>>>>> >> >> >> I realize that 2 * (2^(k/2) - 1) only works for even
>>>>>>>> numbered
>>>>>>>> >> >> >> superblocks,
>>>>>>>> >> >> >> the odd numbered superblocks need an additional term added
>>>>>>>> (the
>>>>>>>> >> >> >> number
>>>>>>>> >> >> >> of
>>>>>>>> >> >> >> data blocks in SB_[k-1]) to jibe with my interpretation;
>>>>>>>> anyhow, I
>>>>>>>> >> >> >> also
>>>>>>>> >> >> >> came
>>>>>>>> >> >> >> across an alternative characterization of superblock in the
>>>>>>>> paper
>>>>>>>> >> >> >> which
>>>>>>>> >> >> >> states that data blocks are grouped within a superblock
>>>>>>>> when they
>>>>>>>> >> >> >> are
>>>>>>>> >> >> >> the
>>>>>>>> >> >> >> same size - to me, though, that implies that my example
>>>>>>>> structure
>>>>>>>> >> >> >> below
>>>>>>>> >> >> >> would be
>>>>>>>> >> >> >>
>>>>>>>> >> >> >> SB_0: [1]
>>>>>>>> >> >> >> SB_1: [2][2][2]
>>>>>>>> >> >> >> SB_2: [4][4][4][4][4][4]
>>>>>>>> >> >> >>
>>>>>>>> >> >> >> which seems to contradict my understanding of (1) below.  I
>>>>>>>> must be
>>>>>>>> >> >> >> reading this upside down.
>>>>>>>> >> >> >>
>>>>>>>> >> >> >> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern
>>>>>>>> >> >> >> <kevin.l.stern at gmail.com>
>>>>>>>> >> >> >> wrote:
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> What am I missing here?  In "Resizable arrays in optimal
>>>>>>>> time and
>>>>>>>> >> >> >>> space"
>>>>>>>> >> >> >>> the authors define their data structure with the following
>>>>>>>> >> >> >>> property:
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> (1)  "When superblock SB_k is fully allocated, it consists
>>>>>>>> of
>>>>>>>> >> >> >>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))."
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> Since the superblocks are zero-indexed, this implies
>>>>>>>> the
>>>>>>>> >> >> >>> following
>>>>>>>> >> >> >>> structure:
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> SB_0: [1]
>>>>>>>> >> >> >>> SB_1: [2]
>>>>>>>> >> >> >>> SB_2: [2][2]
>>>>>>>> >> >> >>> SB_3: [4][4]
>>>>>>>> >> >> >>> SB_4: [4][4][4][4]
>>>>>>>> >> >> >>> [...]
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> Let's have a look at Algorithm 3, Locate(i), with i = 3:
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> r = 100 (the binary expansion of i + 1)
>>>>>>>> >> >> >>> k = |r| - 1 = 2
>>>>>>>> >> >> >>> p = 2^k - 1 = 3
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> What concerns me is their statement that p represents "the
>>>>>>>> number
>>>>>>>> >> >> >>> of
>>>>>>>> >> >> >>> data
>>>>>>>> >> >> >>> blocks in superblocks prior to SB_k."  There are only two
>>>>>>>> data
>>>>>>>> >> >> >>> blocks
>>>>>>>> >> >> >>> in
>>>>>>>> >> >> >>> superblocks prior to SB_2, not three.  Given (1) above,
>>>>>>>> unless I'm
>>>>>>>> >> >> >>> misinterpreting it, the number of data blocks in
>>>>>>>> superblocks prior
>>>>>>>> >> >> >>> to
>>>>>>>> >> >> >>> SB_k
>>>>>>>> >> >> >>> should be:
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1)
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> This, of course, seems to work out much better in my
>>>>>>>> example above,
>>>>>>>> >> >> >>> giving the correct answer to my interpretation of their
>>>>>>>> data
>>>>>>>> >> >> >>> structure, but
>>>>>>>> >> >> >>> I have a hard time believing that this is their mistake
>>>>>>>> rather than
>>>>>>>> >> >> >>> my
>>>>>>>> >> >> >>> misinterpretation.
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> Thoughts?
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> Kevin
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz
>>>>>>>> >> >> >>> <martinrb at google.com>
>>>>>>>> >> >> >>> wrote:
>>>>>>>> >> >> >>>>
>>>>>>>> >> >> >>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern
>>>>>>>> >> >> >>>> <kevin.l.stern at gmail.com>
>>>>>>>> >> >> >>>> wrote:
>>>>>>>> >> >> >>>> > Hi Martin,
>>>>>>>> >> >> >>>> >
>>>>>>>> >> >> >>>> > Thanks much for your feedback.  The first approach that
>>>>>>>> comes to
>>>>>>>> >> >> >>>> > mind
>>>>>>>> >> >> >>>> > to
>>>>>>>> >> >> >>>> > implement O(1) time front as well as rear insertion is
>>>>>>>> to create
>>>>>>>> >> >> >>>> > a
>>>>>>>> >> >> >>>> > cyclic
>>>>>>>> >> >> >>>> > list structure with a front/rear pointer - to insert at
>>>>>>>> the
>>>>>>>> >> >> >>>> > front
>>>>>>>> >> >> >>>> > requires
>>>>>>>> >> >> >>>> > decrementing the front pointer (modulo the size) and to
>>>>>>>> insert
>>>>>>>> >> >> >>>> > at
>>>>>>>> >> >> >>>> > the
>>>>>>>> >> >> >>>> > rear
>>>>>>>> >> >> >>>> > requires incrementing the rear pointer (modulo the
>>>>>>>> size).  We
>>>>>>>> >> >> >>>> > need
>>>>>>>> >> >> >>>> > to
>>>>>>>> >> >> >>>> > resize
>>>>>>>> >> >> >>>> > when the two pointers bump into each other.  Could you
>>>>>>>> explain
>>>>>>>> >> >> >>>> > more
>>>>>>>> >> >> >>>> > about
>>>>>>>> >> >> >>>> > your suggestion of introducing an arraylet that is
>>>>>>>> shared by the
>>>>>>>> >> >> >>>> > front
>>>>>>>> >> >> >>>> > and
>>>>>>>> >> >> >>>> > the rear?
>>>>>>>> >> >> >>>>
>>>>>>>> >> >> >>>> It was a half-baked idea - I don't know if there's a way
>>>>>>>> to turn
>>>>>>>> >> >> >>>> it
>>>>>>>> >> >> >>>> into
>>>>>>>> >> >> >>>> something useful.  I was thinking of the ArrayDeque
>>>>>>>> >> >> >>>> implementation,
>>>>>>>> >> >> >>>> where all the elements live in a single array.
>>>>>>>> >> >> >>>>
>>>>>>>> >> >> >>>> >  It's not clear to me how that would help and/or be a
>>>>>>>> better
>>>>>>>> >> >> >>>> > approach than the cyclic list.  Anyhow, the paper that
>>>>>>>> you
>>>>>>>> >> >> >>>> > reference,
>>>>>>>> >> >> >>>> > "Resizable arrays in optimal time and space", gives a
>>>>>>>> deque so
>>>>>>>> >> >> >>>> > if
>>>>>>>> >> >> >>>> > we
>>>>>>>> >> >> >>>> > take
>>>>>>>> >> >> >>>> > that approach then the deque is specified.
>>>>>>>> >> >> >>>>
>>>>>>>> >> >> >>>> Technically, ArrayList also supports the Deque operations
>>>>>>>> -
>>>>>>>> >> >> >>>> just not efficiently.
>>>>>>>> >> >> >>>
>>>>>>>> >> >> >>
>>>>>>>> >> >> >
>>>>>>>> >> >> >
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>