A List implementation backed by multiple small arrays rather than the traditional single large array.
Benedict Elliott Smith
lists at laerad.com
Sat Apr 24 07:24:45 UTC 2010
Hi Kevin,
It looks like this is because ChunkedArrayList creates only one initial
array of size s, whereas my algorithm expects two. Apologies for not
spotting this - the pattern of allocations is identical after this point.
I'll see if it can be modified to support your pattern of allocation.
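
If I read your change correctly, it amounts to something like the
following sketch (queryIndex is my name for the caller's 0-based index):

// Hedged restatement of the adjustment described above: treat the query
// as 1-based, like the r variable, and shift the resulting array index
// back down by one.
final int index = queryIndex + 1;
final int arrayIndex =
    ((firstArrayOfThisSize + arrayOffset) & Integer.MAX_VALUE) - 1;
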
On 24 April 2010 01:31, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> Hi Benedict,
>
> I took a look at your index decomposition routine; it was not working for
> seed = 1 until I made index equal to the query index plus one (similar to
> my r variable) and arrayIndex equal to ((firstArrayOfThisSize +
> arrayOffset) & Integer.MAX_VALUE) - 1 (note that I'm subtracting one).
> Once I made these changes the routine was actually slower than my version
> (although not by much). Let me know if you have a better way to bring
> your routine in line with the arraylet structure.
>
> Kevin
>
>
> On Fri, Apr 23, 2010 at 2:26 PM, Benedict Elliott Smith <lists at laerad.com> wrote:
>
>> Hi Kevin,
>>
>> Unfortunately this week has been pretty hectic, and I haven't had much
>> time to do much more than theorise on this topic - and this weekend the
>> weather looks set to be much too nice to stay indoors! It looks like you've made
>> really good progress on the ChunkedArrayDeque - I haven't fully digested it
>> yet, but it looks pretty impressive. I'm not sure (since I misunderstood
>> your list implementation prior to reading it in detail) but I think I may
>> have been toying with a similar growth strategy for a hash map variant since
>> the copies are necessary there anyway.
>>
>> I have taken another look at the index algorithm and have tidied it up as
>> below, and it now supports a seed size of 1; however I am not sure that
>> using a value this small is advisable, given that the overhead for the first
>> array is at least 60 bytes; with a seed size of 1 the first 8 indexes would
>> utilise less than 20% of the allocated memory for data, the first 24 less
>> than 25%. I would have thought a seed of at least 2 and perhaps 3 would be
>> advisable.
>>
>> As an aside, it is worth noting that the itemIndex calculation below can
>> be replaced with index & (_backingArray[arrayIndex].length - 1), although
>> I am not certain what effect this will have on performance, given that
>> this would be another memory operation to retrieve the length from the
>> array; but it would make the separation of the calculation into functions
>> more straightforward.
>>
>> final int arraySizeShiftMinusSeed =
>>     (31 - Integer.numberOfLeadingZeros(index >>> seed)) >>> 1;
>> final int arraySizeShift = arraySizeShiftMinusSeed + seed;
>>
>> final int firstArrayOfThisSize =
>>     (1 << arraySizeShiftMinusSeed + 2)
>>     - (1 << arraySizeShiftMinusSeed)
>>     - 1 - (arraySizeShift >>> 31);
>> final int indexRemainder =
>>     index - (1 << arraySizeShift << arraySizeShiftMinusSeed);
>>
>> final int arrayOffset = indexRemainder >>> arraySizeShift;
>> final int arrayIndex = (firstArrayOfThisSize + arrayOffset) & Integer.MAX_VALUE;
>>
>> final int itemIndex = index & ((1 << arraySizeShift) - 1);
>>
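>> For illustration, the length-based variant of the itemIndex calculation
>> mentioned above would look something like this (a sketch only - the
>> _backingArray field name is illustrative, and arraylet lengths are
>> assumed to be powers of two):
>>
>> // Alternative itemIndex: mask with the arraylet's own length instead
>> // of recomputing the shift; costs one extra read of 'length'.
>> final int itemIndex = index & (_backingArray[arrayIndex].length - 1);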
>>
>>
>> On 23 April 2010 11:00, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>
>>> Hi Benedict,
>>>
>>> Have you had a chance to get your index decomposition procedure to work
>>> with seed values less than two?
>>>
>>> Kevin
>>>
>>>
>>> On Sat, Apr 17, 2010 at 11:48 AM, Benedict Elliott Smith <
>>> lists at laerad.com> wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> As it happens I might have something useful still to contribute. As an
>>>> exercise in saving face I revisited the problem to see if I could achieve
>>>> the same complexity bounds as ChunkedArrayList but with a lower overhead. I
>>>> must admit I still didn't fully appreciate how the algorithm in
>>>> ChunkedArrayList worked until I tried to come up with an algorithm with
>>>> similar properties. What I have ended up with is almost identical, except
>>>> that it adds (I think) a couple of incremental improvements, simply by
>>>> redefining the arrayIndex() method. I should note that I have not yet implemented more than
>>>> a prototype as it seems to me your implementation is excellent already, and
>>>> if it is decided to include my modifications the changes should be modest.
>>>>
>>>> Firstly, (I hope that) what I have produced is a little more CPU
>>>> pipeline friendly; there is less dependency on immediately preceding
>>>> calculations at each stage (i.e. more operations should be able to
>>>> proceed simultaneously in the pipeline), and it consists exclusively of
>>>> shifts, addition/subtraction and bitwise ands (&) (except for the conditionals in
>>>> Integer.numberOfLeadingZeros(i)), although the total number of instructions
>>>> is approximately the same.
>>>>
>>>> Secondly, I have modified the algorithm so that a "seed" size can be
>>>> specified (although I expect hard coding a suitable one will ultimately be
>>>> best). Whereas ChunkedArrayList currently requires that the pattern of
>>>> array allocation sizes be [1, 1, 2, 2, 2, 4(..*6), 8(..*12), 16(..*24)],
>>>> we can now support, for some s, [s(..*2), 2s(..*3), 4s(..*6), 8s(..*12),
>>>> 16s(..*24)], etc., although when put in simple text like that it does
>>>> appear to trivialise the change. The benefit of this, though, is twofold:
>>>> 1) for small n the constant factor is reduced (both CPU- and memory-wise);
>>>> and 2) the sqrt(n) bounds are reached more quickly.
>>>>
>>>> As an illustration, consider setting s to 4, and assume the backing
>>>> array is size two and doubles in size with each growth; with
>>>> ChunkedArrayList we would resize at i=2, i=6, i=20, i=72; with s as 4
>>>> we would instead resize at i=8, i=24, i=80, i=288; the cost at each
>>>> would be some multiple of 2, 4, 8, 16 respectively. As you can see, the
>>>> latter is much closer to the sqrt(n) cost - both approach it eventually,
>>>> but my suggestion is to reach it more quickly. This is at the expense of
>>>> more slowly reaching the sqrt(n) wasted memory condition, but given the
>>>> high constant factor cost with respect to memory at this early stage,
>>>> this seems a very sensible trade-off. It seems likely this should also
>>>> have a positive impact on cache performance for smaller lists as well.
>>>>
>>>> Finally, after playing with this idea in my head I am confident I can
>>>> extend the core ideas of this data structure to hashing relatively easily,
>>>> getting the same worst-case O(sqrt(n)) insertion cost, and O(sqrt(n))
>>>> wasted memory guarantees. I notice that this case hasn't been addressed yet,
>>>> although I see from Martin's recent mail that this was raised before. Unless
>>>> there are better suggestions for solving the hash table problem I will have
>>>> a go at it as it seems an interesting problem - that is, assuming there are
>>>> no objections?
>>>>
>>>> I'm interested to hear your thoughts. I hope this time I've been a bit
>>>> more considered in what I've put forward, and hence less of a waste of time!
>>>>
>>>> Code snippet for calculation of array index and item offset:
>>>>
>>>> final int arraySizeShiftMinusSeed =
>>>>     ((31 - Integer.numberOfLeadingZeros(index >>> seed)) >>> 1);
>>>> final int arraySizeShift = arraySizeShiftMinusSeed + seed;
>>>> final int firstArrayOfThisSize =
>>>>     ((((1 << arraySizeShiftMinusSeed + 3)
>>>>     - (1 << arraySizeShiftMinusSeed + 1))) >>> 1) - 1;
>>>> final int indexRemainder = index - ((1 << seed)
>>>>     << arraySizeShiftMinusSeed + arraySizeShiftMinusSeed);
>>>> final int arrayOffset = indexRemainder >>> arraySizeShift;
>>>>
>>>> final int arrayIndex = firstArrayOfThisSize + arrayOffset;
>>>> final int itemIndex = index & ((1 << arraySizeShift) - 1);
>>>>
>>>> the first array size will be 1 << seed - 1 (i.e. seed is equal to s +
>>>> 1); seed only works for values of 2 or more at the moment, FYI
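>>>>
>>>> To make the behaviour concrete, a throwaway driver along these lines (my
>>>> own sketch, not part of the proposal) prints the decomposition for the
>>>> first few indices; indices below 1 << seed are skipped, since the
>>>> snippet above assumes index >= 1 << seed:
>>>>
>>>> // Prints index -> (arrayIndex, itemIndex) so the allocation pattern
>>>> // [s(..*2), 2s(..*3), 4s(..*6), ...] can be checked by eye; seed = 2.
>>>> public class DecomposeDemo {
>>>>     public static void main(String[] args) {
>>>>         final int seed = 2;
>>>>         for (int index = 1 << seed; index < 64; index++) {
>>>>             final int arraySizeShiftMinusSeed = ((31 -
>>>>                 Integer.numberOfLeadingZeros(index >>> seed)) >>> 1);
>>>>             final int arraySizeShift = arraySizeShiftMinusSeed + seed;
>>>>             final int firstArrayOfThisSize =
>>>>                 ((((1 << arraySizeShiftMinusSeed + 3)
>>>>                 - (1 << arraySizeShiftMinusSeed + 1))) >>> 1) - 1;
>>>>             final int indexRemainder = index - ((1 << seed)
>>>>                 << arraySizeShiftMinusSeed + arraySizeShiftMinusSeed);
>>>>             final int arrayOffset = indexRemainder >>> arraySizeShift;
>>>>             final int arrayIndex = firstArrayOfThisSize + arrayOffset;
>>>>             final int itemIndex = index & ((1 << arraySizeShift) - 1);
>>>>             System.out.println(index + " -> (" + arrayIndex + ", " + itemIndex + ")");
>>>>         }
>>>>     }
>>>> }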
>>>>
>>>>
>>>>
>>>> On 16 April 2010 00:18, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>>
>>>>> Oh no worries Benedict, thanks for your interest in the topic. Let me
>>>>> know if you have any other questions or if you have any related ideas or
>>>>> concerns.
>>>>>
>>>>>
>>>>> On Thu, Apr 15, 2010 at 8:00 AM, Benedict Elliott Smith <
>>>>> lists at laerad.com> wrote:
>>>>>
>>>>>> Sorry Kevin - it sounds like I might be more of a hindrance than a
>>>>>> help. That part of the discussion was clearly truncated by the time I
>>>>>> had joined the list - I haven't been able to find the history in the
>>>>>> archives either...
>>>>>>
>>>>>> I was just wondering about the worst case cost of add() as described
>>>>>> by your javadoc; admittedly it is optimal with respect to unused memory, but
>>>>>> the worst case cost of an add is still sqrt(n), with a relatively high
>>>>>> constant factor. I had been thinking that once n passed a threshold the cost
>>>>>> of additions in this other structure would become a constant factor,
>>>>>> offering nice algorithmic complexity guarantees for large n; however since
>>>>>> sqrt(Integer.MAX_VALUE) is ~46,000, the maximum size of new array
>>>>>> allocations would have to be unrealistically small (assuming linear cost for
>>>>>> allocation) for this to be the case. It would still be nice to have a data
>>>>>> structure that avoids needing to copy data with each grow, whilst still
>>>>>> maintaining good memory performance.
>>>>>>
>>>>>> That all being said, I had been going by your javadoc and emails to
>>>>>> ascertain the behaviour of this class, as I couldn't locate a free copy
>>>>>> of [Brodnik99resizablearrays], and it seems this was a bad idea: the
>>>>>> sqrt(n) cost appears to be associated with growing the backing array,
>>>>>> rather than with what I assumed to be copying data between arraylets,
>>>>>> and it seems this cost is pretty optimal. That will teach me to post to a list without
>>>>>> getting my facts straight first. The interesting thing is simply that the
>>>>>> constant factor for this implementation still seems to be quite high,
>>>>>> although perhaps that is simply because I was not benchmarking sufficiently
>>>>>> large values of n.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 15 April 2010 12:12, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Benedict,
>>>>>>>
>>>>>>> Unless I am misreading your post, this now has a very similar feel to
>>>>>>> the first data structure that I posted to the list. Martin Buchholz then
>>>>>>> pointed out that we can incorporate the ideas from
>>>>>>> [Brodnik99resizablearrays] and reap additional benefits.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Apr 15, 2010 at 4:07 AM, Benedict Elliott Smith <
>>>>>>> lists at laerad.com> wrote:
>>>>>>>
>>>>>>>> Hi Kevin,
>>>>>>>>
>>>>>>>> Yes, as I was going to bed last night I realised I had not fully
>>>>>>>> addressed the problem that was originally being considered; I had only
>>>>>>>> reduced the constant factor for addition to the end of the list. A
>>>>>>>> trivial modification fixes that, however: the same scheme, but up to
>>>>>>>> some maximum arraylet size (a power of 2), after which the array is
>>>>>>>> increased in size linearly. Performance doesn't seem to have been
>>>>>>>> affected appreciably, although I have not been exhaustive in the
>>>>>>>> benchmarking:
>>>>>>>>
>>>>>>>> 10 items inserts versus ArrayList: Chunked=1.15, ExpArray=1.16
>>>>>>>> 10 items inserts Chunked / ExpArray = 0.99
>>>>>>>> 10 items get versus ArrayList: Chunked=1.15, ExpArray=1.16
>>>>>>>> 10 items get Chunked / ExpArray = 0.99
>>>>>>>> 100 items inserts versus ArrayList: Chunked=1.24, ExpArray=1.01
>>>>>>>> 100 items inserts Chunked / ExpArray = 1.23
>>>>>>>> 100 items get versus ArrayList: Chunked=1.24, ExpArray=1.01
>>>>>>>> 100 items get Chunked / ExpArray = 1.23
>>>>>>>> 1000 items inserts versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>>>> 1000 items inserts Chunked / ExpArray = 1.19
>>>>>>>> 1000 items get versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>>>> 1000 items get Chunked / ExpArray = 1.19
>>>>>>>> 10000 items inserts versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>>>> 10000 items inserts Chunked / ExpArray = 1.18
>>>>>>>> 10000 items get versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>>>> 10000 items get Chunked / ExpArray = 1.18
>>>>>>>> 100000 items inserts versus ArrayList: Chunked=0.82, ExpArray=0.75
>>>>>>>> 100000 items inserts Chunked / ExpArray = 1.09
>>>>>>>> 100000 items get versus ArrayList: Chunked=0.82, ExpArray=0.75
>>>>>>>> 100000 items get Chunked / ExpArray = 1.09
>>>>>>>>
>>>>>>>> The nice thing about this is that the maximum amount of wasted memory
>>>>>>>> is user-configurable. Even with a low setting, as above (65K),
>>>>>>>> performance seems pretty consistent.
>>>>>>>>
>>>>>>>> The code for calculating the array index and offset is pretty
>>>>>>>> straightforward; I haven't given much thought to optimisations just yet:
>>>>>>>>
>>>>>>>> private final int indexFor(int a, int i) {
>>>>>>>>     return 1 + i - (a > maxArrayIndex
>>>>>>>>         ? (1 + a - maxArrayIndex) << maxArraySizeShift
>>>>>>>>         : 1 << a);
>>>>>>>> }
>>>>>>>> private final int arrayFor(int i) {
>>>>>>>>     return i >= (maxArraySize << 1)
>>>>>>>>         ? (i + 1 >>> maxArraySizeShift) + maxArrayIndex - 1
>>>>>>>>         : 31 - Integer.numberOfLeadingZeros(i + 1);
>>>>>>>> }
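>>>>>>>>
>>>>>>>> For completeness, the fields these rely on could be defined along the
>>>>>>>> following lines (illustrative values on my part - the 65K setting above
>>>>>>>> suggests a 2^16 cap):
>>>>>>>>
>>>>>>>> // Arraylets double in size (1, 2, 4, ...) up to maxArraySize; all
>>>>>>>> // later arraylets have exactly maxArraySize elements.
>>>>>>>> private static final int maxArraySizeShift = 16;            // 2^16 cap
>>>>>>>> private static final int maxArraySize = 1 << maxArraySizeShift;
>>>>>>>> private static final int maxArrayIndex = maxArraySizeShift; // last doubling arraylet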
>>>>>>>>
>>>>>>>> Regarding the double list idea - yes, I agree, I certainly didn't
>>>>>>>> think that one through fully!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 15 April 2010 02:44, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Benedict,
>>>>>>>>>
>>>>>>>>> Like you, I am relatively new to this mailing list; I am also
>>>>>>>>> trying to tread lightly so as not to step on any toes. That being said, I
>>>>>>>>> think that I can offer a response to your inquiry.
>>>>>>>>>
>>>>>>>>> Regarding: "The idea is to simply double the new array size each
>>>>>>>>> time a new array needs to be allocated"
>>>>>>>>>
>>>>>>>>> It seems this would not address the desire of offering an
>>>>>>>>> alternative to the allocation of a large backing array for ArrayList (your
>>>>>>>>> largest backing array could still reach a size of 1/2 * Integer.MAX_VALUE)
>>>>>>>>> and would not address the desire of wasting the (asymptotically) minimum
>>>>>>>>> amount of memory in the worst case while maintaining O(1) amortized time
>>>>>>>>> bounds. The data structure described in [Brodnik99resizablearrays] has a
>>>>>>>>> maximum backing array size of sqrt(n) and caps wasted memory at sqrt(n).
>>>>>>>>> What advantage over ArrayList do you see in your data structure?
>>>>>>>>>
>>>>>>>>> Regarding: "Also, with regard to a Deque implementation, it seems
>>>>>>>>> that the simplest solution would be to simply have two lists, with one
>>>>>>>>> accepting inserts near the beginning and being ordered in reverse whilst
>>>>>>>>> the other accepts inserts near the end."
>>>>>>>>>
>>>>>>>>> What happens with your structure when you add n elements and then
>>>>>>>>> remove element 0 n times? I think that once you work out all the kinks
>>>>>>>>> you'll end up with the two stacks approach, which is mentioned in
>>>>>>>>> [Brodnik99resizablearrays] and which I mentioned in an earlier email, or
>>>>>>>>> you'll end up with the circular list approach, which is not friendly to O(1)
>>>>>>>>> amortized time bounds in a data structure that resizes more often than every
>>>>>>>>> O(n) operations, due to the 'unshift' to the front = 0 position. I think the best approach
>>>>>>>>> is the one mentioned in [Brodnik99resizablearrays], which is the approach
>>>>>>>>> that I am currently working on. Incidentally, this approach also provides
>>>>>>>>> for a much improved index unpacking procedure using only bit shifts and bit
>>>>>>>>> masks, although it is at the expense of (O(1)) additional work during
>>>>>>>>> resize.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Kevin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Apr 14, 2010 at 4:42 PM, Benedict Elliott Smith <
>>>>>>>>> lists at laerad.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I hope you don't consider it rude to involve myself in this
>>>>>>>>>> conversation towards the end - I joined the mailing list only recently.
>>>>>>>>>>
>>>>>>>>>> I'm not sure if this offers a huge amount to the discussion, but I
>>>>>>>>>> have tinkered with a "chunked" array list which seems to offer better time
>>>>>>>>>> performance in general at the cost of greater (worst case) memory
>>>>>>>>>> utilisation. It is easier to understand IMHO as well, although this is not
>>>>>>>>>> necessarily a great benefit here. It turns out the idea is very similar
>>>>>>>>>> to the one already implemented by Kevin, though perhaps simpler. The
>>>>>>>>>> idea is to simply double the new array size each time a new array needs
>>>>>>>>>> to be allocated, or in effect allocate an array that is the size of all
>>>>>>>>>> existing arrays put together. With this scheme the calculations for
>>>>>>>>>> array and offset are really very straightforward: floor(log(i)) and
>>>>>>>>>> 1 + i - 2^floor(log(i)). Memory utilisation is the same as for
>>>>>>>>>> ArrayList, but obviously inserts at the end are much quicker.
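>>>>>>>>>>
>>>>>>>>>> In rough code, the zero-based form of those two expressions is just:
>>>>>>>>>>
>>>>>>>>>> // Arrays have sizes 1, 2, 4, 8, ...; element i lives in array
>>>>>>>>>> // floor(log2(i + 1)) at offset i + 1 - 2^floor(log2(i + 1)).
>>>>>>>>>> static int arrayFor(int i)  { return 31 - Integer.numberOfLeadingZeros(i + 1); }
>>>>>>>>>> static int offsetFor(int i) { return i + 1 - (1 << arrayFor(i)); }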
>>>>>>>>>>
>>>>>>>>>> I have prototyped the data structure this evening and benchmarked
>>>>>>>>>> additions at the end of the list, for which the performance is pretty
>>>>>>>>>> impressive.
>>>>>>>>>>
>>>>>>>>>> Some random statistics for addition only, on the client JVM (I have
>>>>>>>>>> quickly dubbed my implementation ExpArrayList). All statistics were run
>>>>>>>>>> in two rounds with ~1000 runs per round per statistic per list, and the
>>>>>>>>>> second round results were used.
>>>>>>>>>>
>>>>>>>>>> 10 items versus ArrayList: Chunked=1.14, ExpArray=1.02
>>>>>>>>>> 10 items Chunked / ExpArray = 1.12
>>>>>>>>>> 100 items versus ArrayList: Chunked=1.20, ExpArray=0.82
>>>>>>>>>> 100 items Chunked / ExpArray = 1.45
>>>>>>>>>> 1000 items versus ArrayList: Chunked=1.03, ExpArray=0.51
>>>>>>>>>> 1000 items Chunked / ExpArray = 2.02
>>>>>>>>>> 10000 items versus ArrayList: Chunked=0.88, ExpArray=0.49
>>>>>>>>>> 10000 items Chunked / ExpArray = 1.79
>>>>>>>>>> 100000 items versus ArrayList: Chunked=0.32, ExpArray=0.20
>>>>>>>>>> 100000 items Chunked / ExpArray = 1.64
>>>>>>>>>>
>>>>>>>>>> and server JVM:
>>>>>>>>>> 10 items versus ArrayList: Chunked=1.00, ExpArray=1.16
>>>>>>>>>> 10 items Chunked / ExpArray = 0.86
>>>>>>>>>> 100 items versus ArrayList: Chunked=1.29, ExpArray=0.96
>>>>>>>>>> 100 items Chunked / ExpArray = 1.34
>>>>>>>>>> 1000 items versus ArrayList: Chunked=1.16, ExpArray=0.92
>>>>>>>>>> 1000 items Chunked / ExpArray = 1.27
>>>>>>>>>> 10000 items versus ArrayList: Chunked=0.93, ExpArray=0.84
>>>>>>>>>> 10000 items Chunked / ExpArray = 1.12
>>>>>>>>>> 100000 items versus ArrayList: Chunked=0.71, ExpArray=0.65
>>>>>>>>>> 100000 items Chunked / ExpArray = 1.10
>>>>>>>>>>
>>>>>>>>>> Interestingly, insertion at the beginning of the list appears to be
>>>>>>>>>> quicker with ExpArrayList, at least on the server JVM, whereas I would have
>>>>>>>>>> expected them to be fairly close.
>>>>>>>>>> Amazingly ExpArrayList is faster even than ArrayList for insertion
>>>>>>>>>> at the beginning of large lists, which I haven't yet tried to understand.
>>>>>>>>>> Insertion in the middle is similar.
>>>>>>>>>>
>>>>>>>>>> 10 items versus ArrayList: Chunked=9.82, ExpArray=3.80
>>>>>>>>>> 10 items Chunked / ExpArray = 2.59
>>>>>>>>>> 100 items versus ArrayList: Chunked=7.30, ExpArray=3.41
>>>>>>>>>> 100 items Chunked / ExpArray = 2.14
>>>>>>>>>> 1000 items versus ArrayList: Chunked=2.83, ExpArray=1.09
>>>>>>>>>> 1000 items Chunked / ExpArray = 2.59
>>>>>>>>>> 10000 items versus ArrayList: Chunked=1.56, ExpArray=0.72
>>>>>>>>>> 10000 items Chunked / ExpArray = 2.16
>>>>>>>>>>
>>>>>>>>>> Finally, there are promising results for get() from the
>>>>>>>>>> ExpArrayList as well (server JVM), again somehow beating ArrayList for
>>>>>>>>>> larger lists:
>>>>>>>>>> 10 items get versus ArrayList: Chunked=1.27, ExpArray=1.16
>>>>>>>>>> 10 items get Chunked / ExpArray = 1.10
>>>>>>>>>> 100 items get versus ArrayList: Chunked=1.45, ExpArray=1.17
>>>>>>>>>> 100 items get Chunked / ExpArray = 1.25
>>>>>>>>>> 1000 items get versus ArrayList: Chunked=1.42, ExpArray=1.07
>>>>>>>>>> 1000 items get Chunked / ExpArray = 1.33
>>>>>>>>>> 10000 items get versus ArrayList: Chunked=1.26, ExpArray=1.02
>>>>>>>>>> 10000 items get Chunked / ExpArray = 1.24
>>>>>>>>>> 100000 items get versus ArrayList: Chunked=1.05, ExpArray=0.86
>>>>>>>>>> 100000 items get Chunked / ExpArray = 1.22
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'm willing to explore this further but I'm not sure how desirable that
>>>>>>>>>> is, given that Kevin's data structure appears to perform pretty well
>>>>>>>>>> already with respect to CPU time, and better with respect to memory
>>>>>>>>>> utilisation, and in effect this mostly changes only the function to
>>>>>>>>>> determine which array to use, not the body of the implementation. Let
>>>>>>>>>> me know if you would like a copy of the source code and I will find
>>>>>>>>>> somewhere to upload it.
>>>>>>>>>>
>>>>>>>>>> Also, with regard to a Deque implementation, it seems that the simplest
>>>>>>>>>> solution would be to simply have two lists, with one accepting inserts
>>>>>>>>>> near the beginning and being ordered in reverse whilst the other
>>>>>>>>>> accepts inserts near the end. The only trick would be having the
>>>>>>>>>> list at the beginning support iteration in reverse order cheaply, but this
>>>>>>>>>> could easily be achieved by creating an extension of List with a
>>>>>>>>>> reverseIterator() method.
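>>>>>>>>>>
>>>>>>>>>> Indexed access over such a pair would then look roughly like this
>>>>>>>>>> sketch (names are illustrative, and java.util.List is assumed):
>>>>>>>>>>
>>>>>>>>>> // "front" holds the first elements in reverse order, "back" the rest
>>>>>>>>>> // in forward order; get() maps a logical index onto the right half.
>>>>>>>>>> static <E> E get(List<E> front, List<E> back, int index) {
>>>>>>>>>>     return index < front.size()
>>>>>>>>>>         ? front.get(front.size() - 1 - index)  // reversed half
>>>>>>>>>>         : back.get(index - front.size());      // forward half
>>>>>>>>>> }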
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Anyway, not sure if this helped at all but fancied joining in...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 14 April 2010 12:25, Joe Kearney <joe.j.kearney at googlemail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Kevin,
>>>>>>>>>>>
>>>>>>>>>>> It implements List as well as Deque. It is indeed based on ArrayDeque,
>>>>>>>>>>> with the added operations to implement List. It does so reasonably
>>>>>>>>>>> efficiently, moving the fewest elements possible on each operation:
>>>>>>>>>>> zero for the queue operations, at most n/2 for the rest, and all of
>>>>>>>>>>> them for a backing array resize.
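>>>>>>>>>>>
>>>>>>>>>>> As an illustration of the at-most-n/2 movement (a sketch, not the
>>>>>>>>>>> actual CircularArrayList code), an insert shifts whichever side of
>>>>>>>>>>> the insertion point is shorter:
>>>>>>>>>>>
>>>>>>>>>>> // head indexes the first element in a power-of-two circular array;
>>>>>>>>>>> // fields head, tail, size, elements are assumed; mask == elements.length - 1.
>>>>>>>>>>> void insert(int index, Object e) {
>>>>>>>>>>>     if (index < size / 2) {            // shift the shorter left side
>>>>>>>>>>>         head = (head - 1) & mask;
>>>>>>>>>>>         for (int i = 0; i < index; i++)
>>>>>>>>>>>             elements[(head + i) & mask] = elements[(head + i + 1) & mask];
>>>>>>>>>>>     } else {                           // shift the shorter right side
>>>>>>>>>>>         for (int i = size; i > index; i--)
>>>>>>>>>>>             elements[(head + i) & mask] = elements[(head + i - 1) & mask];
>>>>>>>>>>>         tail = (tail + 1) & mask;
>>>>>>>>>>>     }
>>>>>>>>>>>     elements[(head + index) & mask] = e;
>>>>>>>>>>>     size++;
>>>>>>>>>>> }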
>>>>>>>>>>>
>>>>>>>>>>> The idea is to get a replacement for ArrayList that performs like
>>>>>>>>>>> ArrayDeque on remove(0). As a side effect, we should be able to get
>>>>>>>>>>> better performance on other operations by requiring fewer elements to
>>>>>>>>>>> be moved.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Joe
>>>>>>>>>>>
>>>>>>>>>>> 2010/4/14 Kevin L. Stern <kevin.l.stern at gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> Hi Joe,
>>>>>>>>>>>>
>>>>>>>>>>>> I was referring to the ChunkedArrayList when I stated that add
>>>>>>>>>>>> does not amortize to constant time when the data structure employs the
>>>>>>>>>>>> circular list trick to achieve deque behavior; ChunkedArrayList potentially
>>>>>>>>>>>> resizes every n^(1/2) operations.
>>>>>>>>>>>>
>>>>>>>>>>>> Regarding your CircularArrayList, does it differ from Java's
>>>>>>>>>>>> ArrayDeque? I took only a cursory look at it, so please understand if I
>>>>>>>>>>>> have missed your reason for creating CircularArrayList altogether.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Kevin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Apr 13, 2010 at 6:52 AM, Joe Kearney <
>>>>>>>>>>>> joe.j.kearney at googlemail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Kevin, Martin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> To add another discussion point, I've been writing a
>>>>>>>>>>>>> draft/proof-of-concept of retrofitting the List interface onto ArrayDeque.
>>>>>>>>>>>>> This works over the raw array, it doesn't use the fancier structures being
>>>>>>>>>>>>> discussed elsewhere on this list that deal with splitting huge arrays into
>>>>>>>>>>>>> arraylets, or that provide for O(1) insert in the middle.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://code.google.com/p/libjoe/source/browse/trunk/src/joe/collect/CircularArrayList.java
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd be interested if you have any comments in the context of
>>>>>>>>>>>>> this discussion. The code is not entirely ready yet; a couple of tests
>>>>>>>>>>>>> fail (6/789) because of a corner case I haven't nailed yet, but the idea
>>>>>>>>>>>>> is there at least. I'd like to add array shrinking later, when the size
>>>>>>>>>>>>> dips below capacity*0.4 perhaps, to avoid flickering up and down around...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tests show performance to be close to ArrayList for the O(1)
>>>>>>>>>>>>> operations. Timings for indexed reads and writes showed
>>>>>>>>>>>>> no discernible difference between implementations last time I ran the
>>>>>>>>>>>>> tests. I don't understand at the moment why the iterator add at index
>>>>>>>>>>>>> size/3, size/2 performs 30% slower than ArrayList on smaller lists, nor
>>>>>>>>>>>>> the dodgy numbers for ArrayList.insert(5); I'll look at this soon. Those
>>>>>>>>>>>>> operations that become O(1) in a circular implementation (that are
>>>>>>>>>>>>> implemented and tested here) are faster than in ArrayList. Insert/remove in
>>>>>>>>>>>>> the middle are somewhat faster than ArrayList because we only have to copy
>>>>>>>>>>>>> at most half of the elements, except when resizing the array.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kevin, I don't fully understand your point about not amortizing
>>>>>>>>>>>>> to O(1). Certainly that's true for insert not at head or tail. Otherwise
>>>>>>>>>>>>> this implementation only moves array elements to the front on an array
>>>>>>>>>>>>> resize operation, which happens only O(log n) times over a run of adds,
>>>>>>>>>>>>> maybe a little more if we add array shrinking too. This is
>>>>>>>>>>>>> the same as ArrayList. Are you just referring to the add-in-the-middle case?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Some performance results below, code for these is in the
>>>>>>>>>>>>> repository above too. This was the second run, after a warmup.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Joe
>>>>>>>>>>>>>
>>>>>>>>>>>>> --------------------------- CircularArrayList ---------------------------
>>>>>>>>>>>>>  size   add  get  set  iterAdd/3  iterAdd/2  insert(5)  removeRnd  removeMid  remove(0)
>>>>>>>>>>>>>    10    20   67   70        125        102         90        240        191        138
>>>>>>>>>>>>>   100    19   67   70        166        138         94        230        194        118
>>>>>>>>>>>>>  1000    28   64   67        681        538         91        324        382        119
>>>>>>>>>>>>> 10000    30   65   67       5884       4425         94       1296       2330        124
>>>>>>>>>>>>> ------------------------------- ArrayList --------------------------------
>>>>>>>>>>>>>  size   add  get  set  iterAdd/3  iterAdd/2  insert(5)  removeRnd  removeMid  remove(0)
>>>>>>>>>>>>>    10    23   68   70        100         69      32913        162        130        105
>>>>>>>>>>>>>   100    20   67   70        129        104      21944        169        134        135
>>>>>>>>>>>>>  1000    29   63   67        651        506       9602        364        333        526
>>>>>>>>>>>>> 10000    30   63   66       5878       4414       9947       2312       2280       4437
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2010/4/13 Kevin L. Stern <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I had intended to address your request for absolute O(1)
>>>>>>>>>>>>>> operations in the previous email. The approach to achieving this suggested
>>>>>>>>>>>>>> in [Brodnik99resizablearrays] is tantamount to making ArrayList operations
>>>>>>>>>>>>>> absolute O(1) by keeping around an array of size (3/2)*n and filling it with
>>>>>>>>>>>>>> a constant number of entries from the main array each time add is called.
>>>>>>>>>>>>>> Although this distributes the work done during a resize across the n
>>>>>>>>>>>>>> operations required to enter a resize-required state, it is at the expense
>>>>>>>>>>>>>> of additional memory usage and slower add operations. My thought is that
>>>>>>>>>>>>>> this would be a fine approach for a real-time application that requires hard
>>>>>>>>>>>>>> guarantees on performance but would be a liability in so many Java
>>>>>>>>>>>>>> applications that do not require these hard guarantees. I look forward to
>>>>>>>>>>>>>> hearing your thoughts on the matter, though.
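>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In outline, the technique looks something like the sketch below (my
>>>>>>>>>>>>>> illustration rather than the paper's exact scheme - here the spare
>>>>>>>>>>>>>> array is 2x rather than (3/2)*n, and two old elements are copied per
>>>>>>>>>>>>>> add):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> // Incremental resize: build the replacement array a little at a
>>>>>>>>>>>>>> // time so that no single add(...) ever pays for a full copy.
>>>>>>>>>>>>>> private Object[] cur = new Object[8];
>>>>>>>>>>>>>> private Object[] next;   // spare array, filled incrementally
>>>>>>>>>>>>>> private int size, copied;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> void add(Object e) {
>>>>>>>>>>>>>>     if (size == cur.length) {          // spare is fully populated
>>>>>>>>>>>>>>         cur = next; next = null; copied = 0;
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     if (next == null && size + 1 > cur.length / 2)
>>>>>>>>>>>>>>         next = new Object[cur.length * 2];
>>>>>>>>>>>>>>     cur[size] = e;
>>>>>>>>>>>>>>     if (next != null) {
>>>>>>>>>>>>>>         next[size] = e;                // mirror the new element
>>>>>>>>>>>>>>         for (int i = 0; i < 2 && copied < cur.length / 2; i++, copied++)
>>>>>>>>>>>>>>             next[copied] = cur[copied]; // copy two old ones per add
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     size++;
>>>>>>>>>>>>>> }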
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Apr 13, 2010 at 6:18 AM, Kevin L. Stern <
>>>>>>>>>>>>>> kevin.l.stern at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It's interesting to note that the old circular list trick
>>>>>>>>>>>>>>> will not suffice to turn this data structure into a deque since we might be
>>>>>>>>>>>>>>> copying all n elements back to the front = 0 position every n^(1/2)
>>>>>>>>>>>>>>> operations (add wouldn't amortize to O(1)). We could use the old two stacks
>>>>>>>>>>>>>>> trick (push elements onto one stack, flip (the bottom) half (of) the
>>>>>>>>>>>>>>> elements to the 'other' stack when the 'other' stack becomes empty),
>>>>>>>>>>>>>>> mentioned in [Brodnik99resizablearrays], but I find this to be a bit CS
>>>>>>>>>>>>>>> 101. In [Brodnik99resizablearrays] the authors suggest a method for making
>>>>>>>>>>>>>>> all blocks roughly the same size, allowing us to expand/shrink capacity at
>>>>>>>>>>>>>>> the beginning or the end; this is the approach that I will take to create a
>>>>>>>>>>>>>>> deque.
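>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For reference, the two stacks trick in rough code (the textbook form,
>>>>>>>>>>>>>>> using java.util.ArrayDeque as the stack; not something I intend to
>>>>>>>>>>>>>>> include):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> // Deque from two stacks: when one end runs dry, flip the bottom
>>>>>>>>>>>>>>> // half of the other stack over; amortized O(1) per operation.
>>>>>>>>>>>>>>> final Deque<Object> front = new ArrayDeque<Object>(); // top = first
>>>>>>>>>>>>>>> final Deque<Object> back  = new ArrayDeque<Object>(); // top = last
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Object removeFirst() {
>>>>>>>>>>>>>>>     if (front.isEmpty()) rebalance(back, front);
>>>>>>>>>>>>>>>     return front.pop();                // addFirst is front.push
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> void rebalance(Deque<Object> from, Deque<Object> to) {
>>>>>>>>>>>>>>>     Deque<Object> tmp = new ArrayDeque<Object>();
>>>>>>>>>>>>>>>     for (int i = from.size() / 2; i > 0; i--)
>>>>>>>>>>>>>>>         tmp.push(from.pop());          // park the top half
>>>>>>>>>>>>>>>     while (!from.isEmpty())
>>>>>>>>>>>>>>>         to.push(from.pop());           // bottom half flips over
>>>>>>>>>>>>>>>     while (!tmp.isEmpty())
>>>>>>>>>>>>>>>         from.push(tmp.pop());          // restore the top half
>>>>>>>>>>>>>>> }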
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The FAQ for the Sun Contributor Agreement Q3 (
>>>>>>>>>>>>>>> http://www.sun.com/software/opensource/contributor_agreement.jsp#sa_3)
>>>>>>>>>>>>>>> indicates that one should check with the project to determine where the SCA
>>>>>>>>>>>>>>> should be sent. Do you know where I would find this information?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @MISC{Brodnik99resizablearrays,
>>>>>>>>>>>>>>>     author = {Andrej Brodnik and Svante Carlsson and Erik D. Demaine
>>>>>>>>>>>>>>>               and J. Ian Munro and Robert Sedgewick},
>>>>>>>>>>>>>>>     title = {Resizable Arrays in Optimal Time and Space},
>>>>>>>>>>>>>>>     year = {1999}
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Apr 11, 2010 at 4:17 PM, Martin Buchholz <
>>>>>>>>>>>>>>> martinrb at google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Kevin,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for your continuing work on this.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I like the test results, and agree with your analysis.
>>>>>>>>>>>>>>>> I'm especially happy that you're beating
>>>>>>>>>>>>>>>> ArrayList at some operations.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'd like to see O(1) addition at the beginning, and to
>>>>>>>>>>>>>>>> implement both List and Deque (I regret
>>>>>>>>>>>>>>>> our not having done this with ArrayDeque).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> An additional property that would be nice to
>>>>>>>>>>>>>>>> have (but don't try too hard)
>>>>>>>>>>>>>>>> is to provide some kind of real-time
>>>>>>>>>>>>>>>> guarantees on the cost of an individual operation,
>>>>>>>>>>>>>>>> not just amortized time. E.g. ArrayList.add
>>>>>>>>>>>>>>>> is worst-case O(n), making it unsuitable for use
>>>>>>>>>>>>>>>> in some real-time applications.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I will help get your changes into the obvious
>>>>>>>>>>>>>>>> software distributions. I assume you're happy
>>>>>>>>>>>>>>>> with having this class included in any of
>>>>>>>>>>>>>>>> Doug Lea's jsr166, guava-libraries, or the JDK itself.
>>>>>>>>>>>>>>>> You should sign a Sun contributor agreement,
>>>>>>>>>>>>>>>> or whatever the Oracle equivalent is,
>>>>>>>>>>>>>>>> if you have not done so yet.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Doug Lea likes public domain,
>>>>>>>>>>>>>>>> guava-libraries likes the Apache license.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We should give various people a chance to give
>>>>>>>>>>>>>>>> a thumbs-up on the design of this class -
>>>>>>>>>>>>>>>> Doug Lea, Josh Bloch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Apr 11, 2010 at 09:32, Kevin L. Stern <
>>>>>>>>>>>>>>>> kevin.l.stern at gmail.com> wrote:
>>>>>>>>>>>>>>>> > Hello Martin,
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > I spent some time this weekend trying to bring out bugs in the
>>>>>>>>>>>>>>>> > implementation; I believe the latest version to be in decent shape. I
>>>>>>>>>>>>>>>> > have also gathered some data on the performance of ChunkedArrayList
>>>>>>>>>>>>>>>> > over ArrayList using the latest 1.6 JDK, which I've included below
>>>>>>>>>>>>>>>> > (note that the numbers represent the time spent performing the
>>>>>>>>>>>>>>>> > specified operation with ChunkedArrayList over the time spent with
>>>>>>>>>>>>>>>> > ArrayList, so 1.00 indicates equivalent performance, < 1.00 indicates
>>>>>>>>>>>>>>>> > that ChunkedArrayList is less costly and > 1.00 indicates that
>>>>>>>>>>>>>>>> > ArrayList is less costly). I've noticed relatively significant
>>>>>>>>>>>>>>>> > variability in a few of the numbers when I switch hardware; though,
>>>>>>>>>>>>>>>> > these data do seem to represent rough performance expectations. For
>>>>>>>>>>>>>>>> > my test I generated x elements and then timed the process of adding
>>>>>>>>>>>>>>>> > them to ArrayList/ChunkedArrayList, then I performed a get operation
>>>>>>>>>>>>>>>> > on each for indices 0 through x-1 and finally I used the iterator
>>>>>>>>>>>>>>>> > mechanism to retrieve the first through xth element (of course, I
>>>>>>>>>>>>>>>> > performed each of these operations multiple times throwing away the
>>>>>>>>>>>>>>>> > timing for the first few iterations to warm up the JVM).
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Regarding the question of whether or not this belongs in java.util, I
>>>>>>>>>>>>>>>> > would suggest that if it is desirable from a GC point of view to
>>>>>>>>>>>>>>>> > eliminate the large backing array from ArrayList then your suggestion
>>>>>>>>>>>>>>>> > of achieving this by way of a data structure that is both time and
>>>>>>>>>>>>>>>> > space optimal is a particularly elegant solution as it not only
>>>>>>>>>>>>>>>> > guarantees that no backing array will be larger than sqrt(n) elements
>>>>>>>>>>>>>>>> > but it also provides dynamic shrinking behavior, has less maximum
>>>>>>>>>>>>>>>> > memory overhead than ArrayList, and copies (asymptotically) fewer
>>>>>>>>>>>>>>>> > elements during a resize than ArrayList. Of course, this data
>>>>>>>>>>>>>>>> > structure does not do everything better than ArrayList; in
>>>>>>>>>>>>>>>> > particular, indexed access is more costly, due to the required
>>>>>>>>>>>>>>>> > decomposition of the index into backing array index and offset and
>>>>>>>>>>>>>>>> > the additional memory indirection, and insertion-at-an-index is more
>>>>>>>>>>>>>>>> > costly, due to the multiple array copies necessary to complete the
>>>>>>>>>>>>>>>> > shift. That being said, I think that the additional cost of indexed
>>>>>>>>>>>>>>>> > access is partially mitigated by the availability of iterator and
>>>>>>>>>>>>>>>> > listIterator, whose implementations do not use the index
>>>>>>>>>>>>>>>> > decomposition procedure, and the additional cost of
>>>>>>>>>>>>>>>> > insertion-at-an-index is partially mitigated by the fact that
>>>>>>>>>>>>>>>> > insertion-at-an-index is already an undesirable operation on
>>>>>>>>>>>>>>>> > ArrayList due to its linear time complexity.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Kevin
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > 1000000 elements:
>>>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 1.30
>>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.80
>>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.52
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.81
>>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.87
>>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 1.31
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > 100000 elements:
>>>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.96
>>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.86
>>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.48
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.96
>>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.89
>>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 2.68
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > 10000 elements:
>>>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 1.04
>>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.33
>>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.53
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.97
>>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.45
>>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 2.52
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > 1000 elements:
>>>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.99
>>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.27
>>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.54
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.84
>>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.23
>>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 1.11
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > On Fri, Apr 9, 2010 at 7:42 PM, Martin Buchholz <
>>>>>>>>>>>>>>>> martinrb at google.com> wrote:
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> My feeling on whether to support O(1) at both ends
>>>>>>>>>>>>>>>> >> is that any flavor of this that ends up in the JDK
>>>>>>>>>>>>>>>> eventually
>>>>>>>>>>>>>>>> >> should really do this. My idea is that we can
>>>>>>>>>>>>>>>> >> wholeheartedly recommend this collection class
>>>>>>>>>>>>>>>> >> for overall good behavior without any of the surprising
>>>>>>>>>>>>>>>> >> performance traps of existing collection classes.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> But for the preliminary version, it makes sense to
>>>>>>>>>>>>>>>> >> support only O(1) at one end, if it simplifies the
>>>>>>>>>>>>>>>> >> implementation. Random access will of course
>>>>>>>>>>>>>>>> >> be worse than ArrayList, but by how much?
>>>>>>>>>>>>>>>> >> We can do some benchmarking and look for
>>>>>>>>>>>>>>>> >> micro-optimizations now.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Kevin, what is your own personal feeling?
>>>>>>>>>>>>>>>> >> Is the algorithm correct, and efficient enough?
>>>>>>>>>>>>>>>> >> Do you think your new collection belongs in java.util?
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Martin
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> On Sun, Apr 4, 2010 at 04:12, Kevin L. Stern <
>>>>>>>>>>>>>>>> kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>>> >> wrote:
>>>>>>>>>>>>>>>> >> > The data structure is available at the second link that
>>>>>>>>>>>>>>>> I originally
>>>>>>>>>>>>>>>> >> > provided (once again, it is
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en
>>>>>>>>>>>>>>>> ).
>>>>>>>>>>>>>>>> >> > This does not have O(1) time insertion at the front as yet, as it
>>>>>>>>>>>>>>>> >> > was unclear to me whether or not it was agreed upon:
>>>>>>>>>>>>>>>> >> > _________________
>>>>>>>>>>>>>>>> >> > From: Osvaldo Doederlein <opinali at gmail.com>
>>>>>>>>>>>>>>>> >> > Date: Mon, Mar 29, 2010 at 10:08 AM
>>>>>>>>>>>>>>>> >> > Subject: Re: A List implementation backed by multiple
>>>>>>>>>>>>>>>> small arrays
>>>>>>>>>>>>>>>> >> > rather
>>>>>>>>>>>>>>>> >> > than the traditional single large array.
>>>>>>>>>>>>>>>> >> > To: Martin Buchholz <martinrb at google.com>
>>>>>>>>>>>>>>>> >> > Cc: "Kevin L. Stern" <kevin.l.stern at gmail.com>,
>>>>>>>>>>>>>>>> >> > core-libs-dev at openjdk.java.net
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Initially, it would be good enough to replace only java.util.ArrayList
>>>>>>>>>>>>>>>> >> > with minimal overhead. ArrayList does not support efficient
>>>>>>>>>>>>>>>> >> > add-at-front or other enhancements of ArrayDeque; but ArrayList is
>>>>>>>>>>>>>>>> >> > still a much more important and popular collection, it's the primary
>>>>>>>>>>>>>>>> >> > "straight replacement for primitive arrays" and I guess it should
>>>>>>>>>>>>>>>> >> > continue with that role.
>>>>>>>>>>>>>>>> >> > _________________
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > As a disclaimer, I'm still tinkering with this so I'll
>>>>>>>>>>>>>>>> be updating the
>>>>>>>>>>>>>>>> >> > document at the provided link as I find improvements.
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Thoughts?
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Thanks,
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Kevin
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > On Thu, Apr 1, 2010 at 10:28 PM, Martin Buchholz <
>>>>>>>>>>>>>>>> martinrb at google.com>
>>>>>>>>>>>>>>>> >> > wrote:
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >> Hi Kevin,
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >> You're probably the only one on this list who has
>>>>>>>>>>>>>>>> >> >> seriously read the paper. It is not surprising that
>>>>>>>>>>>>>>>> >> >> taking a research paper into production would
>>>>>>>>>>>>>>>> >> >> discover bugs - the research never had to undergo
>>>>>>>>>>>>>>>> >> >> rigorous testing. (I like the Java culture of
>>>>>>>>>>>>>>>> >> >> combining spec + implementation + test suite)
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >> I suggest you ask the authors directly about the bug.
>>>>>>>>>>>>>>>> >> >> They would probably also be interested to hear
>>>>>>>>>>>>>>>> >> >> about your implementation.
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >> Are you aware of Integer.numberOfLeadingZeros?
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >> Martin
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >> On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern <
>>>>>>>>>>>>>>>> kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>>> >> >> wrote:
>>>>>>>>>>>>>>>> >> >> > I'm almost convinced now that the paper is incorrect. The code below
>>>>>>>>>>>>>>>> >> >> > gives me the appropriate index into the index array and the offset
>>>>>>>>>>>>>>>> >> >> > into the data block. That being said, remember when I mentioned that
>>>>>>>>>>>>>>>> >> >> > this will include a bit more work to access an element than a simple
>>>>>>>>>>>>>>>> >> >> > bit shift and a bit mask? Well this is more than a bit more - we'll
>>>>>>>>>>>>>>>> >> >> > be doing this each time an index is requested. I'll spend some time
>>>>>>>>>>>>>>>> >> >> > trying to twiddle the bits to see if I can eliminate/combine some of
>>>>>>>>>>>>>>>> >> >> > the operations.
>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>> >> >> > // lg is presumably floor(log2(r)), e.g.
>>>>>>>>>>>>>>>> >> >> > // 31 - Integer.numberOfLeadingZeros(r).
>>>>>>>>>>>>>>>> >> >> > for (int r = 1; r < 33; r++) {
>>>>>>>>>>>>>>>> >> >> >     int k = lg(r);
>>>>>>>>>>>>>>>> >> >> >     int floorKO2 = k >> 1;
>>>>>>>>>>>>>>>> >> >> >     int powFloorKO2 = (1 << floorKO2);
>>>>>>>>>>>>>>>> >> >> >     // p = number of data blocks in superblocks prior to SB_k
>>>>>>>>>>>>>>>> >> >> >     int p = ((1 << floorKO2) - 1) << 1;
>>>>>>>>>>>>>>>> >> >> >     int ceilKO2;
>>>>>>>>>>>>>>>> >> >> >     if ((k & 1) == 1) {
>>>>>>>>>>>>>>>> >> >> >         ceilKO2 = floorKO2 + 1;
>>>>>>>>>>>>>>>> >> >> >         p += powFloorKO2;
>>>>>>>>>>>>>>>> >> >> >     } else {
>>>>>>>>>>>>>>>> >> >> >         ceilKO2 = floorKO2;
>>>>>>>>>>>>>>>> >> >> >     }
>>>>>>>>>>>>>>>> >> >> >     int e = r & ((1 << ceilKO2) - 1);           // offset in block
>>>>>>>>>>>>>>>> >> >> >     int b = (r >> ceilKO2) & (powFloorKO2 - 1); // block within SB_k
>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>> >> >> >     System.out.println((r - 1) + " " + (p + b) + " " + e);
>>>>>>>>>>>>>>>> >> >> > }
>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>> >> >> > Kevin
>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>> >> >> > On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern
>>>>>>>>>>>>>>>> >> >> > <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>>> >> >> > wrote:
>>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>>> >> >> >> I realize that 2 * (2^(k/2) - 1) only works for even-numbered
>>>>>>>>>>>>>>>> >> >> >> superblocks; the odd-numbered superblocks need an additional term
>>>>>>>>>>>>>>>> >> >> >> added (the number of data blocks in SB_[k-1]) to jive with my
>>>>>>>>>>>>>>>> >> >> >> interpretation; anyhow, I also came across an alternative
>>>>>>>>>>>>>>>> >> >> >> characterization of superblock in the paper which states that data
>>>>>>>>>>>>>>>> >> >> >> blocks are grouped within a superblock when they are the same size -
>>>>>>>>>>>>>>>> >> >> >> to me, though, that implies that my example structure below would be
>>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>>> >> >> >> SB_0: [1]
>>>>>>>>>>>>>>>> >> >> >> SB_1: [2][2][2]
>>>>>>>>>>>>>>>> >> >> >> SB_2: [4][4][4][4][4][4]
>>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>>> >> >> >> which seems to contradict my understanding of (1) below. I must be
>>>>>>>>>>>>>>>> >> >> >> reading this upside down.
>>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>>> >> >> >> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern
>>>>>>>>>>>>>>>> >> >> >> <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>>> >> >> >> wrote:
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> What am I missing here? In "Resizable arrays in optimal time and
>>>>>>>>>>>>>>>> >> >> >>> space" the authors define their data structure with the following
>>>>>>>>>>>>>>>> >> >> >>> property:
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> (1) "When superblock SB_k is fully allocated, it consists of
>>>>>>>>>>>>>>>> >> >> >>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))."
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> Since the superblock is zero-based indexed, this implies the
>>>>>>>>>>>>>>>> >> >> >>> following structure:
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> SB_0: [1]
>>>>>>>>>>>>>>>> >> >> >>> SB_1: [2]
>>>>>>>>>>>>>>>> >> >> >>> SB_2: [2][2]
>>>>>>>>>>>>>>>> >> >> >>> SB_3: [4][4]
>>>>>>>>>>>>>>>> >> >> >>> SB_4: [4][4][4][4]
>>>>>>>>>>>>>>>> >> >> >>> [...]
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> Let's have a look at Algorithm 3, Locate(i), with
>>>>>>>>>>>>>>>> i = 3:
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> r = 100 (the binary expansion of i + 1)
>>>>>>>>>>>>>>>> >> >> >>> k = |r| - 1 = 2
>>>>>>>>>>>>>>>> >> >> >>> p = 2^k - 1 = 3
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> What concerns me is their statement that p represents "the number of
>>>>>>>>>>>>>>>> >> >> >>> data blocks in superblocks prior to SB_k." There are only two data
>>>>>>>>>>>>>>>> >> >> >>> blocks in superblocks prior to SB_2, not three. Given (1) above,
>>>>>>>>>>>>>>>> >> >> >>> unless I'm misinterpreting it, the number of data blocks in
>>>>>>>>>>>>>>>> >> >> >>> superblocks prior to SB_k should be:
>>>>>>>>>>>>>>>> >> >> >>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1)
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> This, of course, seems to work out much better in my example above,
>>>>>>>>>>>>>>>> >> >> >>> giving the correct answer to my interpretation of their data
>>>>>>>>>>>>>>>> >> >> >>> structure, but I have a hard time believing that this is their
>>>>>>>>>>>>>>>> >> >> >>> mistake rather than my misinterpretation.
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> Thoughts?
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> Kevin
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz
>>>>>>>>>>>>>>>> >> >> >>> <martinrb at google.com>
>>>>>>>>>>>>>>>> >> >> >>> wrote:
>>>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>>>> >> >> >>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern
>>>>>>>>>>>>>>>> >> >> >>>> <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>>> >> >> >>>> wrote:
>>>>>>>>>>>>>>>> >> >> >>>> > Hi Martin,
>>>>>>>>>>>>>>>> >> >> >>>> >
>>>>>>>>>>>>>>>> >> >> >>>> > Thanks much for your feedback. The first approach that comes to
>>>>>>>>>>>>>>>> >> >> >>>> > mind to implement O(1) time front as well as rear insertion is to
>>>>>>>>>>>>>>>> >> >> >>>> > create a cyclic list structure with a front/rear pointer - to
>>>>>>>>>>>>>>>> >> >> >>>> > insert at the front requires decrementing the front pointer
>>>>>>>>>>>>>>>> >> >> >>>> > (modulo the size) and to insert at the rear requires incrementing
>>>>>>>>>>>>>>>> >> >> >>>> > the rear pointer (modulo the size). We need to resize when the
>>>>>>>>>>>>>>>> >> >> >>>> > two pointers bump into each other. Could you explain more about
>>>>>>>>>>>>>>>> >> >> >>>> > your suggestion of introducing an arraylet that is shared by the
>>>>>>>>>>>>>>>> >> >> >>>> > front and the rear?
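>>>>>>>>>>>>>>>> >> >> >>>> >
>>>>>>>>>>>>>>>> >> >> >>>> > In code, the pointer arithmetic I have in mind is essentially the
>>>>>>>>>>>>>>>> >> >> >>>> > following sketch (roughly what java.util.ArrayDeque does
>>>>>>>>>>>>>>>> >> >> >>>> > internally; fields are illustrative):
>>>>>>>>>>>>>>>> >> >> >>>> >
>>>>>>>>>>>>>>>> >> >> >>>> > // elements.length is kept a power of two, so & replaces modulo.
>>>>>>>>>>>>>>>> >> >> >>>> > void addFirst(Object e) {
>>>>>>>>>>>>>>>> >> >> >>>> >     head = (head - 1) & (elements.length - 1); // decrement mod size
>>>>>>>>>>>>>>>> >> >> >>>> >     elements[head] = e;
>>>>>>>>>>>>>>>> >> >> >>>> >     if (head == tail) doubleCapacity();        // pointers met: resize
>>>>>>>>>>>>>>>> >> >> >>>> > }
>>>>>>>>>>>>>>>> >> >> >>>> > void addLast(Object e) {
>>>>>>>>>>>>>>>> >> >> >>>> >     elements[tail] = e;
>>>>>>>>>>>>>>>> >> >> >>>> >     tail = (tail + 1) & (elements.length - 1); // increment mod size
>>>>>>>>>>>>>>>> >> >> >>>> >     if (tail == head) doubleCapacity();
>>>>>>>>>>>>>>>> >> >> >>>> > }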
>>>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>>>> >> >> >>>> It was a half-baked idea - I don't know if there's a way to turn it
>>>>>>>>>>>>>>>> >> >> >>>> into something useful. I was thinking of the ArrayDeque
>>>>>>>>>>>>>>>> >> >> >>>> implementation, where all the elements live in a single array.
>>>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>>>> >> >> >>>> > It's not clear to me how that would help and/or be a better
>>>>>>>>>>>>>>>> >> >> >>>> > approach than the cyclic list. Anyhow, the paper that you
>>>>>>>>>>>>>>>> >> >> >>>> > reference, "Resizable arrays in optimal time and space", gives a
>>>>>>>>>>>>>>>> >> >> >>>> > deque so if we take that approach then the deque is specified.
>>>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>>>> >> >> >>>> Technically, ArrayList also supports the Deque
>>>>>>>>>>>>>>>> operations -
>>>>>>>>>>>>>>>> >> >> >>>> just not efficiently.
>>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>