A List implementation backed by multiple small arrays rather than the traditional single large array.
Benedict Elliott Smith
lists at laerad.com
Fri Apr 23 19:26:09 UTC 2010
Hi Kevin,
Unfortunately this week has been pretty hectic, and I haven't had much time
to do much more than theorise on this topic - and this weekend the weather
looks set to be much too nice to stay indoors! It looks like you've made
really good progress on the ChunkedArrayDeque - I haven't fully digested it
yet, but it looks pretty impressive. I'm not sure (since I misunderstood
your list implementation prior to reading it in detail) but I think I may
have been toying with a similar growth strategy for a hash map variant since
the copies are necessary there anyway.
I have taken another look at the index algorithm and have tidied it up as
below, and it now supports a seed size of 1; however I am not sure that
using a value this small is advisable, given that the overhead for the first
array is at least 60 bytes; with a seed size of 1 the first 8 indexes would
utilise less than 20% of the allocated memory for data, the first 24 less
than 25%. I would have thought a seed of at least 2 and perhaps 3 would be
advisable.
As an aside, it is worth noting that the indexOffset calculation below can
be replaced with index & (_backingArray[arrayIndex].length - 1) , although I
am not certain what effect this will have on performance, given that this
would be another memory operation to retrieve the length from the array; but
it would make the separation of the calculation into functions more
straightforward.
final int arraySizeShiftMinusSeed = (31 - Integer.numberOfLeadingZeros(index >>> seed)) >>> 1;
final int arraySizeShift = arraySizeShiftMinusSeed + seed;
final int firstArrayOfThisSize = (1 << arraySizeShiftMinusSeed + 2) - (1 << arraySizeShiftMinusSeed) - 1 - (arraySizeShift >>> 31);
final int indexRemainder = index - (1 << arraySizeShift << arraySizeShiftMinusSeed);
final int arrayOffset = indexRemainder >>> arraySizeShift;
final int arrayIndex = (firstArrayOfThisSize + arrayOffset) & Integer.MAX_VALUE;
final int itemIndex = index & ((1 << arraySizeShift) - 1);
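As a cross-check on the shift-based calculation above, the same decomposition can be computed naively by walking the allocation pattern directly. The sketch below is an editor's reference illustration only (it assumes a first-array size of 1 << seed and the repetition counts 2, 3, 6, 12, ... described in this thread), not the optimised code:

```java
// Naive, loop-based index decomposition for the allocation pattern
// s(x2), 2s(x3), 4s(x6), 8s(x12), ... with s = 1 << seed.
// Reference sketch for sanity-checking the shift-based version above.
public class NaiveIndex {
    // Returns {arrayIndex, itemIndex} for a logical list index.
    static int[] decompose(int index, int seed) {
        int size = 1 << seed;   // current arraylet size
        int count = 2;          // how many arraylets of this size
        int arrayIndex = 0;     // index into the array of arraylets
        int start = 0;          // first logical index held by this arraylet
        while (true) {
            for (int i = 0; i < count; i++) {
                if (index < start + size)
                    return new int[] { arrayIndex, index - start };
                start += size;
                arrayIndex++;
            }
            count = (count == 2) ? 3 : count << 1; // 2, 3, 6, 12, 24, ...
            size <<= 1;
        }
    }

    public static void main(String[] args) {
        // seed = 0 gives the pattern [1, 1, 2, 2, 2, 4(x6), 8(x12), ...]
        int[] r = decompose(7, 0);  // last slot of the third size-2 arraylet
        if (r[0] != 4 || r[1] != 1) throw new AssertionError();
        r = decompose(8, 0);        // first slot of the first size-4 arraylet
        if (r[0] != 5 || r[1] != 0) throw new AssertionError();
        System.out.println("ok");
    }
}
```

A loop like this is far too slow for production use, but it is handy for exhaustively comparing outputs against the bit-twiddled version over a range of indices.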
On 23 April 2010 11:00, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> Hi Benedict,
>
> Have you had a chance to get your index decomposition procedure to work
> with seed values less than two?
>
> Kevin
>
>
> On Sat, Apr 17, 2010 at 11:48 AM, Benedict Elliott Smith <lists at laerad.com
> > wrote:
>
>> Hi Kevin,
>>
>> As it happens I might have something useful still to contribute. As an
>> exercise in saving face I revisited the problem to see if I could achieve
>> the same complexity bounds as ChunkedArrayList but with a lower overhead. I
>> must admit I still didn't fully appreciate how the algorithm in
>> ChunkedArrayList worked until I tried to come up with an algorithm with
>> similar properties. What I have ended up with is almost identical, except
>> that it adds (I think) a couple of incremental improvements, simply by redefining the
>> arrayIndex() method. I should note that I have not yet implemented more than
>> a prototype as it seems to me your implementation is excellent already, and
>> if it is decided to include my modifications the changes should be modest.
>>
>> Firstly, (I hope that) what I have produced is a little more CPU-pipeline
>> friendly; there is less dependency on immediately preceding calculations at
>> each stage (i.e. more operations should be able to proceed simultaneously
>> in the pipeline), and it consists exclusively of shifts, addition/subtraction
>> and bitwise ANDs (except for the conditionals in
>> Integer.numberOfLeadingZeros(i)), although the total number of instructions
>> is approximately the same.
>>
>> Secondly, I have modified the algorithm so that a "seed" size can be
>> specified (although I expect hard coding a suitable one will ultimately be
>> best). Whereas ChunkedArrayList currently requires that the pattern of array
>> allocation sizes be [1, 1, 2, 2, 2, 4(..*6), 8(..*12), 16(..*24)] we can now
>> support, for some "*s*", [*s*(..*2), 2*s*(..*3), 4*s*(..*6), 8*s*(..*12),
>> 16*s*(..*24)] etc. although when put in simple text like that it does
>> appear to trivialise the change. The benefit of this, though, is twofold:
>> 1) for small n the constant factor is reduced (both CPU and memory wise);
>> and 2) the sqrt(n) bounds are reached more quickly also.
>>
>> As an illustration, consider setting *s* to 4, and assume the backing
>> array is size two and doubles in size with each growth; with
>> ChunkedArrayList we would resize at i=2, i=6, i=20, i=72; with *s* as 4
>> we would instead resize at i=8,i=24,i=80,i=288; the cost at each would be
>> some multiple of 2,4,8,16 respectively. As you can see the latter is much
>> closer to the sqrt(n) cost - both approach it eventually, but my suggestion
>> is to reach it more quickly. This is at the expense of more slowly reaching
>> the sqrt(n) wasted memory condition, but given the high constant factor cost
>> with respect to memory at this early stage, this seems a very sensible
>> trade-off. It seems likely this should also have a positive impact on cache
>> performance for smaller lists.
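The resize points quoted in that illustration (i=2, 6, 20, 72 versus i=8, 24, 80, 288) can be reproduced by summing arraylet capacities. A sketch (editor's illustration, assuming the reference array starts with room for two arraylets and doubles when full):

```java
// Reproduces the resize points above: the backing array of arraylet
// references doubles when full, so a resize happens once the elements held
// by 2, 4, 8, 16, ... arraylets are exhausted. Assumes the allocation
// pattern s(x2), 2s(x3), 4s(x6), 8s(x12), ...
public class ResizePoints {
    // Total elements held by the first 'arraylets' arraylets, with the
    // first arraylet of size s.
    static long capacity(int arraylets, int s) {
        long total = 0;
        int size = s, count = 2;
        while (arraylets > 0) {
            int take = Math.min(arraylets, count);
            total += (long) size * take;
            arraylets -= take;
            count = (count == 2) ? 3 : count << 1;
            size <<= 1;
        }
        return total;
    }

    public static void main(String[] args) {
        // s=1 prints 2, 6, 20, 72; s=4 prints 8, 24, 80, 288.
        for (int refs = 2; refs <= 16; refs <<= 1)
            System.out.println("s=1: resize at i=" + capacity(refs, 1)
                    + "; s=4: resize at i=" + capacity(refs, 4));
    }
}
```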
>>
>> Finally, after playing with this idea in my head I am confident I can
>> extend the core ideas of this data structure to hashing relatively easily,
>> getting the same worst-case O(sqrt(n)) insertion cost, and O(sqrt(n))
>> wasted memory guarantees. I notice that this case hasn't been addressed yet,
>> although I see from Martin's recent mail that this was raised before. Unless
>> there are better suggestions for solving the hash table problem I will have
>> a go at it as it seems an interesting problem - that is, assuming there are
>> no objections?
>>
>> I'm interested to hear your thoughts. I hope this time I've been a bit
>> more considered in what I've put forward, and hence less of a waste of time!
>>
>> Code snippet for calculation of array index and item offset:
>>
>> final int arraySizeShiftMinusSeed = ((31 - Integer.numberOfLeadingZeros(index >>> seed)) >>> 1);
>> final int arraySizeShift = arraySizeShiftMinusSeed + seed;
>> final int firstArrayOfThisSize = ((((1 << arraySizeShiftMinusSeed + 3) - (1 << arraySizeShiftMinusSeed + 1))) >>> 1) - 1;
>> final int indexRemainder = index - ((1 << seed) << arraySizeShiftMinusSeed + arraySizeShiftMinusSeed);
>> final int arrayOffset = indexRemainder >>> arraySizeShift;
>>
>> final int arrayIndex = firstArrayOfThisSize + arrayOffset;
>> final int itemIndex = index & ((1 << arraySizeShift) - 1);
>>
>> the first array size will be 1 << seed - 1 (i.e. seed is equal to *s* +
>> 1); seed only works for values of 2 or more at this moment, fyi
>>
>>
>>
>> On 16 April 2010 00:18, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>
>>> Oh no worries Benedict, thanks for your interest in the topic. Let me
>>> know if you have any other questions or if you have any related ideas or
>>> concerns.
>>>
>>>
>>> On Thu, Apr 15, 2010 at 8:00 AM, Benedict Elliott Smith <
>>> lists at laerad.com> wrote:
>>>
>>>> Sorry Kevin - it sounds like I might be being of more hindrance than
>>>> help. That part of the discussion was clearly truncated by the time I had
>>>> joined the list - I haven't been able to find the history in the archives
>>>> either...
>>>>
>>>> I was just wondering about the worst case cost of add() as described by
>>>> your javadoc; admittedly it is optimal with respect to unused memory, but
>>>> the worst case cost of an add is still sqrt(n), with a relatively high
>>>> constant factor. I had been thinking that once n passed a threshold the cost
>>>> of additions in this other structure would become a constant factor,
>>>> offering nice algorithmic complexity guarantees for large n; however since
>>>> sqrt(Integer.MAX_VALUE) is ~46,000, the maximum size of new array
>>>> allocations would have to be unrealistically small (assuming linear cost for
>>>> allocation) for this to be the case. It would still be nice to have a data
>>>> structure that avoids needing to copy data with each grow, whilst still
>>>> maintaining good memory performance.
>>>>
>>>> That *all* being said, I had been going by your javadoc and emails to
>>>> ascertain the behaviour of this class, as I couldn't locate a free copy
>>>> of [Brodnik99resizablearrays], and it seems this was a bad idea; as the
>>>> sqrt(n) cost appears to be associated with growing the backing array, rather
>>>> than with what I assumed to be copying data between arraylets, and it seems
>>>> this cost is pretty optimal. That will teach me to post to a list without
>>>> getting my facts straight first. The interesting thing is simply that the
>>>> constant factor for this implementation still seems to be quite high,
>>>> although perhaps that is simply because I was not benchmarking sufficiently
>>>> large values of n.
>>>>
>>>>
>>>>
>>>> On 15 April 2010 12:12, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>>
>>>>> Hi Benedict,
>>>>>
>>>>> Unless I am misreading your post, this now has a very similar feel to
>>>>> the first data structure that I posted to the list. Martin Buchholz then
>>>>> pointed out that we can incorporate the ideas from
>>>>> [Brodnik99resizablearrays] and reap additional benefits.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Kevin
>>>>>
>>>>>
>>>>> On Thu, Apr 15, 2010 at 4:07 AM, Benedict Elliott Smith <
>>>>> lists at laerad.com> wrote:
>>>>>
>>>>>> Hi Kevin,
>>>>>>
>>>>>> Yes, as I was going to bed last night I realised I had not fully
>>>>>> addressed the problem that was originally being visited; only reduced the
>>>>>> constant factor for addition to the end of the list. A trivial modification
>>>>>> fixes that, however; same scheme but up to some maximum arraylet size (of a
>>>>>> power of 2), after which the array is increased in size linearly.
>>>>>> Performance doesn't seem to have been affected appreciably, although I
>>>>>> have not been exhaustive in the benchmarking:
>>>>>>
>>>>>> 10 items inserts versus ArrayList: Chunked=1.15, ExpArray=1.16
>>>>>> 10 items inserts Chunked / ExpArray = 0.99
>>>>>> 10 items get versus ArrayList: Chunked=1.15, ExpArray=1.16
>>>>>> 10 items get Chunked / ExpArray = 0.99
>>>>>> 100 items inserts versus ArrayList: Chunked=1.24, ExpArray=1.01
>>>>>> 100 items inserts Chunked / ExpArray = 1.23
>>>>>> 100 items get versus ArrayList: Chunked=1.24, ExpArray=1.01
>>>>>> 100 items get Chunked / ExpArray = 1.23
>>>>>> 1000 items inserts versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>> 1000 items inserts Chunked / ExpArray = 1.19
>>>>>> 1000 items get versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>> 1000 items get Chunked / ExpArray = 1.19
>>>>>> 10000 items inserts versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>> 10000 items inserts Chunked / ExpArray = 1.18
>>>>>> 10000 items get versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>> 10000 items get Chunked / ExpArray = 1.18
>>>>>> 100000 items inserts versus ArrayList: Chunked=0.82, ExpArray=0.75
>>>>>> 100000 items inserts Chunked / ExpArray = 1.09
>>>>>> 100000 items get versus ArrayList: Chunked=0.82, ExpArray=0.75
>>>>>> 100000 items get Chunked / ExpArray = 1.09
>>>>>>
>>>>>> The nice thing about this is that the maximum amount of wasted memory
>>>>>> is user configurable. Even with a low setting as above (65K) performance
>>>>>> seems pretty consistent.
>>>>>>
>>>>>> The code for calculating the index and array offset is pretty
>>>>>> straightforward; I haven't given much thought to optimisations just yet:
>>>>>>
>>>>>> private final int indexFor(int a, int i) {
>>>>>>     return 1 + i - (a > maxArrayIndex ? (1 + a - maxArrayIndex) << maxArraySizeShift : 1 << a);
>>>>>> }
>>>>>> private final int arrayFor(int i) {
>>>>>>     return i >= (maxArraySize << 1) ? (i + 1 >>> maxArraySizeShift) + maxArrayIndex - 1 : 31 - Integer.numberOfLeadingZeros(i + 1);
>>>>>> }
>>>>>>
>>>>>> Regarding the double list idea - yes, I agree, I certainly didn't
>>>>>> think that one through fully!
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 15 April 2010 02:44, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Benedict,
>>>>>>>
>>>>>>> Like you, I am relatively new to this mailing list; I am also trying
>>>>>>> to tread lightly so as not to step on any toes. That being said, I think
>>>>>>> that I can offer a response to your inquiry.
>>>>>>>
>>>>>>> Regarding: "The idea is to simply double the new array size each
>>>>>>> time a new array needs to be allocated"
>>>>>>>
>>>>>>> It seems this would not address the desire of offering an alternative
>>>>>>> to the allocation of a large backing array for ArrayList (your largest
>>>>>>> backing array could still reach a size of 1/2 * Integer.MAX_VALUE) and would
>>>>>>> not address the desire of wasting the (asymptotically) minimum amount of
>>>>>>> memory in the worst case while maintaining O(1) amortized time bounds. The
>>>>>>> data structure described in [Brodnik99resizablearrays] has a maximum backing
>>>>>>> array size of sqrt(n) and caps wasted memory at sqrt(n). What advantage
>>>>>>> over ArrayList do you see in your data structure?
>>>>>>>
>>>>>>> Regarding: "Also, with regard to a Deque implementation, it seems
>>>>>>> that the simplest solution would be to simply have two lists, with one
>>>>>>> accepting inserts for near the beginning and being ordered in reverse whilst
>>>>>>> the other accepted inserts for near to the end."
>>>>>>>
>>>>>>> What happens with your structure when you add n elements and then
>>>>>>> remove element 0 n times? I think that once you work out all the kinks
>>>>>>> you'll end up with the two stacks approach, which is mentioned in
>>>>>>> [Brodnik99resizablearrays] and which I mentioned in an earlier email, or
>>>>>>> you'll end up with the circular list approach, which is not friendly to O(1)
>>>>>>> amortized time bounds in a data structure that resizes more often than O(n)
>>>>>>> due to the 'unshift' to the front = 0 position. I think the best approach
>>>>>>> is the one mentioned in [Brodnik99resizablearrays], which is the approach
>>>>>>> that I am currently working on. Incidentally, this approach also provides
>>>>>>> for a much improved index unpacking procedure using only bit shifts and bit
>>>>>>> masks, although it is at the expense of (O(1)) additional work during
>>>>>>> resize.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Apr 14, 2010 at 4:42 PM, Benedict Elliott Smith <
>>>>>>> lists at laerad.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I hope you don't consider it rude to involve myself in this
>>>>>>>> conversation towards the end - I joined the mailing list only recently.
>>>>>>>>
>>>>>>>> I'm not sure if this offers a huge amount to the discussion, but I
>>>>>>>> have tinkered with a "chunked" array list which seems to offer better time
>>>>>>>> performance in general at the cost of greater (worst case) memory
>>>>>>>> utilisation. It is easier to understand IMHO as well, although this is not
>>>>>>>> necessarily a great benefit here. It turns out the idea is very similar to
>>>>>>>> the one implemented already by Kevin, though; but perhaps simpler. The idea
>>>>>>>> is to simply double the new array size each time a new array needs to be
>>>>>>>> allocated, or in effect allocate an array that is the size of all existing
>>>>>>>> arrays put together. With this scheme the calculations for array and offset
>>>>>>>> are really very straightforward (floor(log(i)) and 1 + i -
>>>>>>>> 2^floor(log(i))). Memory utilisation is the same as for ArrayList, but
>>>>>>>> obviously inserts at the end are much quicker.
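A minimal sketch of that decomposition, reconstructed from the formulas above (pure doubling, ignoring the maximum-arraylet cap introduced later in the thread; not the actual ExpArrayList source):

```java
// Doubling-scheme index decomposition: array a holds logical indices
// 2^a - 1 .. 2^(a+1) - 2, so arrayFor(i) = floor(log2(i + 1)) and the
// offset within that array is 1 + i - 2^arrayFor(i).
public class ExpIndex {
    static int arrayFor(int i) {
        return 31 - Integer.numberOfLeadingZeros(i + 1);
    }

    static int indexFor(int a, int i) {
        return 1 + i - (1 << a);
    }

    public static void main(String[] args) {
        // Round-trip check: walking indices 0..1023 visits arrays of size
        // 1, 2, 4, ... with offsets 0 .. size-1 in order.
        int array = 0, offset = 0;
        for (int i = 0; i < 1024; i++) {
            if (offset == (1 << array)) { array++; offset = 0; }
            if (arrayFor(i) != array || indexFor(array, i) != offset)
                throw new AssertionError("at " + i);
            offset++;
        }
        System.out.println("ok");
    }
}
```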
>>>>>>>>
>>>>>>>> I have prototyped the data structure this evening and benchmarked
>>>>>>>> additions at the end of the list, for which the performance is pretty
>>>>>>>> impressive.
>>>>>>>>
>>>>>>>> Some random statistics for addition only on the client JVM (I have
>>>>>>>> quickly dubbed my implementation ExpArrayList)
>>>>>>>> All statistics were run in two rounds with ~1000 runs per round per
>>>>>>>> statistic per list, and the second round results were used.
>>>>>>>>
>>>>>>>> 10 items versus ArrayList: Chunked=1.14, ExpArray=1.02
>>>>>>>> 10 items Chunked / ExpArray = 1.12
>>>>>>>> 100 items versus ArrayList: Chunked=1.20, ExpArray=0.82
>>>>>>>> 100 items Chunked / ExpArray = 1.45
>>>>>>>> 1000 items versus ArrayList: Chunked=1.03, ExpArray=0.51
>>>>>>>> 1000 items Chunked / ExpArray = 2.02
>>>>>>>> 10000 items versus ArrayList: Chunked=0.88, ExpArray=0.49
>>>>>>>> 10000 items Chunked / ExpArray = 1.79
>>>>>>>> 100000 items versus ArrayList: Chunked=0.32, ExpArray=0.20
>>>>>>>> 100000 items Chunked / ExpArray = 1.64
>>>>>>>>
>>>>>>>> and server JVM:
>>>>>>>> 10 items versus ArrayList: Chunked=1.00, ExpArray=1.16
>>>>>>>> 10 items Chunked / ExpArray = 0.86
>>>>>>>> 100 items versus ArrayList: Chunked=1.29, ExpArray=0.96
>>>>>>>> 100 items Chunked / ExpArray = 1.34
>>>>>>>> 1000 items versus ArrayList: Chunked=1.16, ExpArray=0.92
>>>>>>>> 1000 items Chunked / ExpArray = 1.27
>>>>>>>> 10000 items versus ArrayList: Chunked=0.93, ExpArray=0.84
>>>>>>>> 10000 items Chunked / ExpArray = 1.12
>>>>>>>> 100000 items versus ArrayList: Chunked=0.71, ExpArray=0.65
>>>>>>>> 100000 items Chunked / ExpArray = 1.10
>>>>>>>>
>>>>>>>> Interestingly insertion at the beginning of the list appears to be
>>>>>>>> quicker with ExpArrayList, at least on the server JVM, whereas I would have
>>>>>>>> expected them to be fairly close.
>>>>>>>> Amazingly ExpArrayList is faster even than ArrayList for insertion
>>>>>>>> at the beginning of large lists, which I haven't yet tried to understand.
>>>>>>>> Insertion in the middle is similar.
>>>>>>>>
>>>>>>>> 10 items versus ArrayList: Chunked=9.82, ExpArray=3.80
>>>>>>>> 10 items Chunked / ExpArray = 2.59
>>>>>>>> 100 items versus ArrayList: Chunked=7.30, ExpArray=3.41
>>>>>>>> 100 items Chunked / ExpArray = 2.14
>>>>>>>> 1000 items versus ArrayList: Chunked=2.83, ExpArray=1.09
>>>>>>>> 1000 items Chunked / ExpArray = 2.59
>>>>>>>> 10000 items versus ArrayList: Chunked=1.56, ExpArray=0.72
>>>>>>>> 10000 items Chunked / ExpArray = 2.16
>>>>>>>>
>>>>>>>> Finally, there are promising results for get() from the ExpArrayList
>>>>>>>> as well (server JVM), again somehow beating ArrayList for larger lists:
>>>>>>>> 10 items get versus ArrayList: Chunked=1.27, ExpArray=1.16
>>>>>>>> 10 items get Chunked / ExpArray = 1.10
>>>>>>>> 100 items get versus ArrayList: Chunked=1.45, ExpArray=1.17
>>>>>>>> 100 items get Chunked / ExpArray = 1.25
>>>>>>>> 1000 items get versus ArrayList: Chunked=1.42, ExpArray=1.07
>>>>>>>> 1000 items get Chunked / ExpArray = 1.33
>>>>>>>> 10000 items get versus ArrayList: Chunked=1.26, ExpArray=1.02
>>>>>>>> 10000 items get Chunked / ExpArray = 1.24
>>>>>>>> 100000 items get versus ArrayList: Chunked=1.05, ExpArray=0.86
>>>>>>>> 100000 items get Chunked / ExpArray = 1.22
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm willing to explore this further but I'm not sure how desirable
>>>>>>>> that is, given that Kevin's data structure appears to perform pretty well
>>>>>>>> already with respect to CPU time, and better with respect to memory
>>>>>>>> utilisation, and in effect this mostly changes only the function to
>>>>>>>> determine which array to use, not the body of the implementation. Let me
>>>>>>>> know if you would like a copy of the source code and I will find somewhere
>>>>>>>> to upload it.
>>>>>>>>
>>>>>>>> Also, with regard to a Deque implementation, it seems that the
>>>>>>>> simplest solution would be to simply have two lists, with one accepting
>>>>>>>> inserts for near the beginning and being ordered in reverse whilst the other
>>>>>>>> accepting inserts for near to the end. The only trick would be having the
>>>>>>>> list at the beginning support iteration in reverse order cheaply, but this
>>>>>>>> could easily be achieved by creating an extension of List with a
>>>>>>>> reverseIterator() method.
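The two-list scheme can be sketched as follows; this is a hypothetical illustration of the idea only (the follow-up in this thread explains why repeated remove(0) undermines it):

```java
// Two-list deque sketch: 'front' stores the elements nearest the beginning
// in reverse order, 'back' stores the rest in normal order, so adds near
// either end are appends. Illustrative only; names are hypothetical.
import java.util.ArrayList;

public class TwoListDeque<E> {
    private final ArrayList<E> front = new ArrayList<>(); // reversed
    private final ArrayList<E> back = new ArrayList<>();

    public void addFirst(E e) { front.add(e); }
    public void addLast(E e)  { back.add(e); }
    public int size()         { return front.size() + back.size(); }

    public E get(int index) {
        int f = front.size();
        return index < f ? front.get(f - 1 - index) : back.get(index - f);
    }

    public static void main(String[] args) {
        TwoListDeque<Integer> d = new TwoListDeque<>();
        d.addLast(3); d.addLast(4);
        d.addFirst(2); d.addFirst(1);
        for (int i = 0; i < 4; i++)
            if (d.get(i) != i + 1) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The weakness appears once one side is drained: removing from the head when `front` is empty means shifting out of `back`, which is exactly the cost this layout was meant to avoid.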
>>>>>>>>
>>>>>>>>
>>>>>>>> Anyway, not sure if this helped at all but fancied joining in...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 14 April 2010 12:25, Joe Kearney <joe.j.kearney at googlemail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Kevin,
>>>>>>>>>
>>>>>>>>> It implements List, as well as Deque. It is indeed based on
>>>>>>>>> ArrayDeque, with the added operations to implement list. It does so
>>>>>>>>> reasonably efficiently, moving the fewest elements possible on each
>>>>>>>>> operation, that is zero for the queue operations, at most n/2 for the rest
>>>>>>>>> and all of them for a backing array resize.
>>>>>>>>>
>>>>>>>>> The idea is to get a replacement for arraylist that performs like
>>>>>>>>> arraydeque on remove(0). As a side effect, we should be able to get better
>>>>>>>>> performance on other operations by requiring fewer elements to be moved.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Joe
>>>>>>>>>
>>>>>>>>> 2010/4/14 Kevin L. Stern <kevin.l.stern at gmail.com>
>>>>>>>>>
>>>>>>>>> Hi Joe,
>>>>>>>>>>
>>>>>>>>>> I was referring to the ChunkedArrayList when I stated that add
>>>>>>>>>> does not amortize to constant time when the data structure employs the
>>>>>>>>>> circular list trick to achieve deque behavior; ChunkedArrayList potentially
>>>>>>>>>> resizes every n^(1/2) operations.
>>>>>>>>>>
>>>>>>>>>> Regarding your CircularArrayList, does it differ from Java's
>>>>>>>>>> ArrayDeque? I took only a cursory look at it, so please understand if I
>>>>>>>>>> have missed your reason for creating CircularArrayList altogether.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Kevin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Apr 13, 2010 at 6:52 AM, Joe Kearney <
>>>>>>>>>> joe.j.kearney at googlemail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Kevin, Martin,
>>>>>>>>>>>
>>>>>>>>>>> To add another discussion point, I've been writing a
>>>>>>>>>>> draft/proof-of-concept of retrofitting the List interface onto ArrayDeque.
>>>>>>>>>>> This works over the raw array, it doesn't use the fancier structures being
>>>>>>>>>>> discussed elsewhere on this list that deal with splitting huge arrays into
>>>>>>>>>>> arraylets, or that provide for O(1) insert in the middle.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> http://code.google.com/p/libjoe/source/browse/trunk/src/joe/collect/CircularArrayList.java
>>>>>>>>>>>
>>>>>>>>>>> I'd be interested if you have any comments in the context of this
>>>>>>>>>>> discussion. The code is not entirely ready yet, a couple of tests fail
>>>>>>>>>>> (6/789) because of a corner case I haven't nailed yet, but the idea is there
>>>>>>>>>>> at least. I'd like to add array shrinking later, when the size dips below
>>>>>>>>>>> capacity*0.4 perhaps, to avoid flickering up and down around...
>>>>>>>>>>>
>>>>>>>>>>> Tests show performance to be close to ArrayList for the O(1)
>>>>>>>>>>> operations. Timings for indexed reads and writes showed
>>>>>>>>>>> no discernible difference between implementations last time I ran the
>>>>>>>>>>> tests. I don't understand at the moment why the iterator add at index
>>>>>>>>>>> size/3, size/2 performs 30% slower than ArrayList on smaller lists, nor the
>>>>>>>>>>> dodgy numbers for ArrayList.insert(5), I'll look at this soon. Those
>>>>>>>>>>> operations that become O(1) in a circular implementation (that are
>>>>>>>>>>> implemented and tested here) are faster than in ArrayList. Insert/remove in
>>>>>>>>>>> the middle are somewhat faster than ArrayList because we only have to copy
>>>>>>>>>>> at most half of the elements, except when resizing the array.
>>>>>>>>>>>
>>>>>>>>>>> Kevin, I don't fully understand your point about not amortizing
>>>>>>>>>>> to O(1). Certainly that's true for insert not at head or tail. Otherwise
>>>>>>>>>>> this implementation only moves array elements to the front on an array
>>>>>>>>>>> resize operation which happens every O(ln n) operations at most, if we do
>>>>>>>>>>> lots of adds, maybe a little more if we add array shrinking too. This is
>>>>>>>>>>> the same as ArrayList. Are you just referring to the add-in-the-middle case?
>>>>>>>>>>>
>>>>>>>>>>> Some performance results below, code for these is in the
>>>>>>>>>>> repository above too. This was the second run, after a warmup.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Joe
>>>>>>>>>>>
>>>>>>>>>>> ------------------------------------------------
>>>>>>>>>>> CircularArrayList ------------------------------------------------
>>>>>>>>>>> size add get set iterAdd/3
>>>>>>>>>>> iterAdd/2 insert(5) removeRnd removeMid remove(0)
>>>>>>>>>>> 10 20 67 70 125
>>>>>>>>>>> 102 90 240 191 138
>>>>>>>>>>> 100 19 67 70 166
>>>>>>>>>>> 138 94 230 194 118
>>>>>>>>>>> 1000 28 64 67 681
>>>>>>>>>>> 538 91 324 382 119
>>>>>>>>>>> 10000 30 65 67 5884
>>>>>>>>>>> 4425 94 1296 2330 124
>>>>>>>>>>> ---------------------------------------------------- ArrayList
>>>>>>>>>>> ----------------------------------------------------
>>>>>>>>>>> size add get set iterAdd/3
>>>>>>>>>>> iterAdd/2 insert(5) removeRnd removeMid remove(0)
>>>>>>>>>>> 10 23 68 70 100
>>>>>>>>>>> 69 32913 162 130 105
>>>>>>>>>>> 100 20 67 70 129
>>>>>>>>>>> 104 21944 169 134 135
>>>>>>>>>>> 1000 29 63 67 651
>>>>>>>>>>> 506 9602 364 333 526
>>>>>>>>>>> 10000 30 63 66 5878
>>>>>>>>>>> 4414 9947 2312 2280 4437
>>>>>>>>>>>
>>>>>>>>>>> 2010/4/13 Kevin L. Stern <kevin.l.stern at gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>
>>>>>>>>>>>> I had intended to address your request for absolute O(1)
>>>>>>>>>>>> operations in the previous email. The approach to achieving this suggested
>>>>>>>>>>>> in [Brodnik99resizablearrays] is tantamount to making ArrayList operations
>>>>>>>>>>>> absolute O(1) by keeping around an array of size (3/2)*n and filling it with
>>>>>>>>>>>> a constant number of entries from the main array each time add is called.
>>>>>>>>>>>> Although this distributes the work done during a resize across the n
>>>>>>>>>>>> operations required to enter a resize-required state, it is at the expense
>>>>>>>>>>>> of additional memory usage and slower add operations. My thought is that
>>>>>>>>>>>> this would be a fine approach for a real-time application that requires hard
>>>>>>>>>>>> guarantees on performance but would be a liability in so many Java
>>>>>>>>>>>> applications that do not require these hard guarantees. I look forward to
>>>>>>>>>>>> hearing your thoughts on the matter, though.
>>>>>>>>>>>>
>>>>>>>>>>>> Kevin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Apr 13, 2010 at 6:18 AM, Kevin L. Stern <
>>>>>>>>>>>> kevin.l.stern at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's interesting to note that the old circular list trick will
>>>>>>>>>>>>> not suffice to turn this data structure into a deque since we might be
>>>>>>>>>>>>> copying all n elements back to the front = 0 position every n^(1/2)
>>>>>>>>>>>>> operations (add wouldn't amortize to O(1)). We could use the old two stacks
>>>>>>>>>>>>> trick (push elements onto one stack, flip (the bottom) half (of) the
>>>>>>>>>>>>> elements to the 'other' stack when the 'other' stack becomes empty),
>>>>>>>>>>>>> mentioned in [Brodnik99resizablearrays], but I find this to be a bit CS
>>>>>>>>>>>>> 101. In [Brodnik99resizablearrays] the authors suggest a method for making
>>>>>>>>>>>>> all blocks roughly the same size, allowing us to expand/shrink capacity at
>>>>>>>>>>>>> the beginning or the end; this is the approach that I will take to create a
>>>>>>>>>>>>> deque.
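The two-stacks trick described there can be sketched as follows (an editor's illustration of the classic technique, not code from [Brodnik99resizablearrays]):

```java
// Two-stacks deque sketch: elements are pushed onto one stack, and when the
// other side runs empty, the bottom half of the non-empty stack is flipped
// across. The deque order is reverse(left) followed by right; each stack's
// top is at the end of its list.
import java.util.ArrayList;

public class TwoStackDeque<E> {
    private final ArrayList<E> left = new ArrayList<>();  // top = deque head
    private final ArrayList<E> right = new ArrayList<>(); // top = deque tail

    public void addFirst(E e) { left.add(e); }
    public void addLast(E e)  { right.add(e); }

    public E removeFirst() {
        if (left.isEmpty()) rebalance(right, left);
        return left.remove(left.size() - 1);
    }

    public E removeLast() {
        if (right.isEmpty()) rebalance(left, right);
        return right.remove(right.size() - 1);
    }

    // Flip the bottom half of 'from' onto 'to', reversing order so the
    // element nearest the end being served lands on top of 'to'.
    private void rebalance(ArrayList<E> from, ArrayList<E> to) {
        int k = (from.size() + 1) / 2; // bottom half, at least one element
        for (int i = k - 1; i >= 0; i--) to.add(from.get(i));
        ArrayList<E> rest = new ArrayList<>(from.subList(k, from.size()));
        from.clear();
        from.addAll(rest);
    }

    public static void main(String[] args) {
        TwoStackDeque<Integer> d = new TwoStackDeque<>();
        for (int i = 1; i <= 5; i++) d.addLast(i);
        for (int i = 1; i <= 5; i++)
            if (d.removeFirst() != i) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Moving only half the elements at each flip is what gives the amortized O(1) bound; flipping everything would re-create the worst case it is meant to avoid.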
>>>>>>>>>>>>>
>>>>>>>>>>>>> The FAQ for the Sun Contributor Agreement Q3 (
>>>>>>>>>>>>> http://www.sun.com/software/opensource/contributor_agreement.jsp#sa_3)
>>>>>>>>>>>>> indicates that one should check with the project to determine where the SCA
>>>>>>>>>>>>> should be sent. Do you know where I would find this information?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>
>>>>>>>>>>>>> @MISC{Brodnik99resizablearrays,
>>>>>>>>>>>>>   author = {Andrej Brodnik and Svante Carlsson and Erik D. Demaine and J. Ian Munro and Robert Sedgewick},
>>>>>>>>>>>>>   title = {Resizable Arrays in Optimal Time and Space},
>>>>>>>>>>>>>   year = {1999}
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Apr 11, 2010 at 4:17 PM, Martin Buchholz <
>>>>>>>>>>>>> martinrb at google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Kevin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for your continuing work on this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I like the test results, and agree with your analysis.
>>>>>>>>>>>>>> I'm especially happy that you're beating
>>>>>>>>>>>>>> ArrayList at some operations.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'd like to see O(1) addition at the beginning,
>>>>>>>>>>>>>> and both List and Deque implemented (I regret
>>>>>>>>>>>>>> our not having done this with ArrayDeque).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> An additional property that would be nice to
>>>>>>>>>>>>>> have (but don't try too hard)
>>>>>>>>>>>>>> is to provide some kind of real-time
>>>>>>>>>>>>>> guarantees on the cost of an individual operation,
>>>>>>>>>>>>>> not just amortized time. E.g. ArrayList.add
>>>>>>>>>>>>>> is worst-case O(n), making it unsuitable for use
>>>>>>>>>>>>>> in some real-time applications.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I will help get your changes into the obvious
>>>>>>>>>>>>>> software distributions. I assume you're happy
>>>>>>>>>>>>>> with having this class included in any of
>>>>>>>>>>>>>> Doug Lea's jsr166, guava-libraries, or the JDK itself.
>>>>>>>>>>>>>> You should sign a Sun contributor agreement,
>>>>>>>>>>>>>> or whatever the Oracle equivalent is,
>>>>>>>>>>>>>> if you have not done so yet.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Doug Lea likes public domain,
>>>>>>>>>>>>>> guava-libraries likes the Apache license.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We should give various people a chance to give
>>>>>>>>>>>>>> a thumbs up on the design of this class -
>>>>>>>>>>>>>> Doug Lea, Josh Bloch.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Apr 11, 2010 at 09:32, Kevin L. Stern <
>>>>>>>>>>>>>> kevin.l.stern at gmail.com> wrote:
>>>>>>>>>>>>>> > Hello Martin,
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > I spent some time this weekend trying to bring out bugs in
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> > implementation; I believe the latest version to be in decent
>>>>>>>>>>>>>> shape. I have
>>>>>>>>>>>>>> > also gathered some data on the performance of
>>>>>>>>>>>>>> ChunkedArrayList over
>>>>>>>>>>>>>> > ArrayList using the latest 1.6 JDK, which I've included
>>>>>>>>>>>>>> below (note that the
>>>>>>>>>>>>>> > numbers represent the time spent performing the specified
>>>>>>>>>>>>>> operation with
>>>>>>>>>>>>>> > ChunkedArrayList over the time spent with ArrayList, so 1.00
>>>>>>>>>>>>>> indicates
>>>>>>>>>>>>>> > equivalent performance, < 1.00 indicates that
>>>>>>>>>>>>>> ChunkedArrayList is less
>>>>>>>>>>>>>> > costly and > 1.00 indicates that ArrayList is less costly).
>>>>>>>>>>>>>> I've noticed
>>>>>>>>>>>>>> > relatively significant variability in a few of the numbers
>>>>>>>>>>>>>> when I switch
>>>>>>>>>>>>>> > hardware; though, these data do seem to represent rough
>>>>>>>>>>>>>> performance
>>>>>>>>>>>>>> > expectations. For my test I generated x elements and then
>>>>>>>>>>>>>> timed the process
>>>>>>>>>>>>>> > of adding them to ArrayList/ChunkedArrayList, then I
>>>>>>>>>>>>>> performed a get
>>>>>>>>>>>>>> > operation on each for indices 0 through x-1 and finally I
>>>>>>>>>>>>>> used the iterator
>>>>>>>>>>>>>> > mechanism to retrieve the first through xth element (of
>>>>>>>>>>>>>> course, I performed
>>>>>>>>>>>>>> > each of these operations multiple times throwing away the
>>>>>>>>>>>>>> timing for the
>>>>>>>>>>>>>> > first few iterations to warm up the JVM).
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Regarding the question of whether or not this belongs in java.util, I
>>>>>>>>>>>>>> > would suggest that if it is desirable from a GC point of view to
>>>>>>>>>>>>>> > eliminate the large backing array from ArrayList, then your suggestion of
>>>>>>>>>>>>>> > achieving this by way of a data structure that is both time and space
>>>>>>>>>>>>>> > optimal is a particularly elegant solution, as it not only guarantees
>>>>>>>>>>>>>> > that no backing array will be larger than sqrt(n) elements but also
>>>>>>>>>>>>>> > provides dynamic shrinking behavior, has less maximum memory overhead
>>>>>>>>>>>>>> > than ArrayList, and copies (asymptotically) fewer elements during a
>>>>>>>>>>>>>> > resize than ArrayList. Of course, this data structure does not do
>>>>>>>>>>>>>> > everything better than ArrayList; in particular, indexed access is more
>>>>>>>>>>>>>> > costly, due to the required decomposition of the index into backing array
>>>>>>>>>>>>>> > index and offset and the additional memory indirection, and
>>>>>>>>>>>>>> > insertion-at-an-index is more costly, due to the multiple array copies
>>>>>>>>>>>>>> > necessary to complete the shift. That being said, I think that the
>>>>>>>>>>>>>> > additional cost of indexed access is partially mitigated by the
>>>>>>>>>>>>>> > availability of iterator and listIterator, whose implementations do not
>>>>>>>>>>>>>> > use the index decomposition procedure, and the additional cost of
>>>>>>>>>>>>>> > insertion-at-an-index is partially mitigated by the fact that
>>>>>>>>>>>>>> > insertion-at-an-index is already an undesirable operation on ArrayList
>>>>>>>>>>>>>> > due to its linear time complexity.
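The index decomposition that makes indexed access more costly can be illustrated with a toy example. This is not Kevin's code; it assumes fixed power-of-two chunks, where the decomposition reduces to a shift and a mask (the variable-size chunks in the actual structure need the more involved arithmetic discussed elsewhere in this thread).

```java
// Toy illustration of chunked indexed access: the logical index is split
// into a chunk index and an offset, adding one extra memory indirection
// compared with a single flat backing array.
public class ChunkedIndex {
    static final int SHIFT = 10;               // chunk size 1024, illustrative
    static final int MASK = (1 << SHIFT) - 1;

    static int chunkOf(int index)  { return index >>> SHIFT; }  // which chunk
    static int offsetOf(int index) { return index & MASK; }     // offset in chunk

    public static void main(String[] args) {
        int i = 5000;
        // chunk 4 (elements 4096..5119), offset 904
        System.out.println(chunkOf(i) + ", " + offsetOf(i));
    }
}
```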
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Kevin
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > 1000000 elements:
>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 1.30
>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.80
>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.52
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.81
>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.87
>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 1.31
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > 100000 elements:
>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.96
>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.86
>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.48
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.96
>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.89
>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 2.68
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > 10000 elements:
>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 1.04
>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.33
>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.53
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.97
>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.45
>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 2.52
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > 1000 elements:
>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.99
>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.27
>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.54
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.84
>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.23
>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 1.11
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > On Fri, Apr 9, 2010 at 7:42 PM, Martin Buchholz <
>>>>>>>>>>>>>> martinrb at google.com> wrote:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> My feeling on whether to support O(1) at both ends
>>>>>>>>>>>>>> >> is that any flavor of this that ends up in the JDK eventually
>>>>>>>>>>>>>> >> should really do this. My idea is that we can
>>>>>>>>>>>>>> >> wholeheartedly recommend this collection class
>>>>>>>>>>>>>> >> for overall good behavior without any of the surprising
>>>>>>>>>>>>>> >> performance traps of existing collection classes.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> But for the preliminary version, it makes sense to
>>>>>>>>>>>>>> >> support only O(1) at one end, if it simplifies the
>>>>>>>>>>>>>> >> implementation. Random access will of course
>>>>>>>>>>>>>> >> be worse than ArrayList, but by how much?
>>>>>>>>>>>>>> >> We can do some benchmarking and look for
>>>>>>>>>>>>>> >> micro-optimizations now.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Kevin, what is your own personal feeling?
>>>>>>>>>>>>>> >> Is the algorithm correct, and efficient enough?
>>>>>>>>>>>>>> >> Do you think your new collection belongs in java.util?
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Martin
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> On Sun, Apr 4, 2010 at 04:12, Kevin L. Stern <
>>>>>>>>>>>>>> kevin.l.stern at gmail.com>
>>>>>>>>>>>>>> >> wrote:
>>>>>>>>>>>>>> >> > The data structure is available at the second link that I originally
>>>>>>>>>>>>>> >> > provided (once again, it is
>>>>>>>>>>>>>> >> > https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en ).
>>>>>>>>>>>>>> >> > This does not yet have O(1) time insertion at the front, as it was
>>>>>>>>>>>>>> >> > unclear to me whether or not it was agreed upon:
>>>>>>>>>>>>>> >> > _________________
>>>>>>>>>>>>>> >> > From: Osvaldo Doederlein <opinali at gmail.com>
>>>>>>>>>>>>>> >> > Date: Mon, Mar 29, 2010 at 10:08 AM
>>>>>>>>>>>>>> >> > Subject: Re: A List implementation backed by multiple small arrays
>>>>>>>>>>>>>> >> > rather than the traditional single large array.
>>>>>>>>>>>>>> >> > To: Martin Buchholz <martinrb at google.com>
>>>>>>>>>>>>>> >> > Cc: "Kevin L. Stern" <kevin.l.stern at gmail.com>,
>>>>>>>>>>>>>> >> > core-libs-dev at openjdk.java.net
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Initially, it would be good enough to replace only java.util.ArrayList
>>>>>>>>>>>>>> >> > with minimal overhead. ArrayList does not support efficient
>>>>>>>>>>>>>> >> > add-at-front or other enhancements of ArrayDeque; but ArrayList is
>>>>>>>>>>>>>> >> > still a much more important and popular collection, it's the primary
>>>>>>>>>>>>>> >> > "straight replacement for primitive arrays" and I guess it should
>>>>>>>>>>>>>> >> > continue with that role.
>>>>>>>>>>>>>> >> > _________________
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > As a disclaimer, I'm still tinkering with this so I'll be updating the
>>>>>>>>>>>>>> >> > document at the provided link as I find improvements.
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Thoughts?
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Thanks,
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Kevin
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > On Thu, Apr 1, 2010 at 10:28 PM, Martin Buchholz <
>>>>>>>>>>>>>> martinrb at google.com>
>>>>>>>>>>>>>> >> > wrote:
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> Hi Kevin,
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> You're probably the only one on this list who has
>>>>>>>>>>>>>> >> >> seriously read the paper. It is not surprising that
>>>>>>>>>>>>>> >> >> taking a research paper into production would
>>>>>>>>>>>>>> >> >> discover bugs - the research never had to undergo
>>>>>>>>>>>>>> >> >> rigorous testing. (I like the Java culture of
>>>>>>>>>>>>>> >> >> combining spec + implementation + test suite.)
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> I suggest you ask the authors directly about the bug.
>>>>>>>>>>>>>> >> >> They would probably also be interested to hear
>>>>>>>>>>>>>> >> >> about your implementation.
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> Are you aware of Integer.numberOfLeadingZeros?
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> Martin
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern <
>>>>>>>>>>>>>> kevin.l.stern at gmail.com>
>>>>>>>>>>>>>> >> >> wrote:
>>>>>>>>>>>>>> >> >> > I'm almost convinced now that the paper is incorrect. The code below
>>>>>>>>>>>>>> >> >> > gives me the appropriate index into the index array and the offset
>>>>>>>>>>>>>> >> >> > into the data block. That being said, remember when I mentioned that
>>>>>>>>>>>>>> >> >> > this will include a bit more work to access an element than a simple
>>>>>>>>>>>>>> >> >> > bit shift and a bit mask? Well this is more than a bit more - we'll
>>>>>>>>>>>>>> >> >> > be doing this each time an index is requested. I'll spend some time
>>>>>>>>>>>>>> >> >> > trying to twiddle the bits to see if I can eliminate/combine some of
>>>>>>>>>>>>>> >> >> > the operations.
>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>> >> >> > for (int r = 1; r < 33; r++) {
>>>>>>>>>>>>>> >> >> >     int k = lg(r);  // lg(r) = floor(log2(r)) = 31 - Integer.numberOfLeadingZeros(r)
>>>>>>>>>>>>>> >> >> >     int floorKO2 = k >> 1;
>>>>>>>>>>>>>> >> >> >     int powFloorKO2 = (1 << floorKO2);
>>>>>>>>>>>>>> >> >> >     int p = ((1 << floorKO2) - 1) << 1;
>>>>>>>>>>>>>> >> >> >     int ceilKO2;
>>>>>>>>>>>>>> >> >> >     if ((k & 1) == 1) {
>>>>>>>>>>>>>> >> >> >         ceilKO2 = floorKO2 + 1;
>>>>>>>>>>>>>> >> >> >         p += powFloorKO2;
>>>>>>>>>>>>>> >> >> >     } else {
>>>>>>>>>>>>>> >> >> >         ceilKO2 = floorKO2;
>>>>>>>>>>>>>> >> >> >     }
>>>>>>>>>>>>>> >> >> >     int e = r & ((1 << ceilKO2) - 1);
>>>>>>>>>>>>>> >> >> >     int b = (r >> ceilKO2) & (powFloorKO2 - 1);
>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>> >> >> >     System.out.println((r - 1) + " " + (p + b) + " " + e);
>>>>>>>>>>>>>> >> >> > }
>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>> >> >> > Kevin
>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>> >> >> > On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern
>>>>>>>>>>>>>> >> >> > <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>> >> >> > wrote:
>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>> >> >> >> I realize that 2 * (2^(k/2) - 1) only works for even numbered
>>>>>>>>>>>>>> >> >> >> superblocks; the odd numbered superblocks need an additional term
>>>>>>>>>>>>>> >> >> >> added (the number of data blocks in SB_[k-1]) to jive with my
>>>>>>>>>>>>>> >> >> >> interpretation. Anyhow, I also came across an alternative
>>>>>>>>>>>>>> >> >> >> characterization of superblock in the paper which states that data
>>>>>>>>>>>>>> >> >> >> blocks are grouped within a superblock when they are the same size -
>>>>>>>>>>>>>> >> >> >> to me, though, that implies that my example structure below would be
>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>> >> >> >> SB_0: [1]
>>>>>>>>>>>>>> >> >> >> SB_1: [2][2][2]
>>>>>>>>>>>>>> >> >> >> SB_2: [4][4][4][4][4][4]
>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>> >> >> >> which seems to contradict my understanding of (1) below. I must be
>>>>>>>>>>>>>> >> >> >> reading this upside down.
>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>> >> >> >> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern
>>>>>>>>>>>>>> >> >> >> <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>> >> >> >> wrote:
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> What am I missing here? In "Resizable arrays in optimal time and
>>>>>>>>>>>>>> >> >> >>> space" the authors define their data structure with the following
>>>>>>>>>>>>>> >> >> >>> property:
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> (1) "When superblock SB_k is fully allocated, it consists of
>>>>>>>>>>>>>> >> >> >>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))."
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> Since the superblock is zero-based indexed, this implies the
>>>>>>>>>>>>>> >> >> >>> following structure:
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> SB_0: [1]
>>>>>>>>>>>>>> >> >> >>> SB_1: [2]
>>>>>>>>>>>>>> >> >> >>> SB_2: [2][2]
>>>>>>>>>>>>>> >> >> >>> SB_3: [4][4]
>>>>>>>>>>>>>> >> >> >>> SB_4: [4][4][4][4]
>>>>>>>>>>>>>> >> >> >>> [...]
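This reading of property (1) can be rendered in a few lines of Java (the class and method names here are illustrative); printing k = 0..4 reproduces the listing above.

```java
// Superblock layout under property (1) as read above: SB_k holds
// 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2)).
public class SuperblockLayout {
    static int blocksIn(int k)  { return 1 << (k / 2); }        // 2^(floor(k/2))
    static int blockSize(int k) { return 1 << ((k + 1) / 2); }  // 2^(ceil(k/2))

    public static void main(String[] args) {
        for (int k = 0; k <= 4; k++) {
            StringBuilder sb = new StringBuilder("SB_" + k + ":");
            for (int b = 0; b < blocksIn(k); b++)
                sb.append(" [").append(blockSize(k)).append("]");
            System.out.println(sb);  // SB_0: [1], SB_1: [2], SB_2: [2] [2], ...
        }
    }
}
```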
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> Let's have a look at Algorithm 3, Locate(i), with i = 3:
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> r = 100 (the binary expansion of i + 1)
>>>>>>>>>>>>>> >> >> >>> k = |r| - 1 = 2
>>>>>>>>>>>>>> >> >> >>> p = 2^k - 1 = 3
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> What concerns me is their statement that p represents "the number
>>>>>>>>>>>>>> >> >> >>> of data blocks in superblocks prior to SB_k." There are only two
>>>>>>>>>>>>>> >> >> >>> data blocks in superblocks prior to SB_2, not three. Given (1)
>>>>>>>>>>>>>> >> >> >>> above, unless I'm misinterpreting it, the number of data blocks in
>>>>>>>>>>>>>> >> >> >>> superblocks prior to SB_k should be:
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1)
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> This, of course, seems to work out much better in my example above,
>>>>>>>>>>>>>> >> >> >>> giving the correct answer to my interpretation of their data
>>>>>>>>>>>>>> >> >> >>> structure, but I have a hard time believing that this is their
>>>>>>>>>>>>>> >> >> >>> mistake rather than my misinterpretation.
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> Thoughts?
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> Kevin
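Kevin's count of data blocks prior to SB_k can be checked mechanically. The sketch below (method names `bySum` and `byFormula` are invented here) compares direct summation over superblocks with his closed form, including the extra SB_[k-1] term he mentions for odd k.

```java
// Number of data blocks in superblocks prior to SB_k, under the reading
// that SB_j contains 2^(floor(j/2)) data blocks: direct summation vs. the
// closed form 2 * (2^(k/2) - 1), with SB_[k-1]'s blocks added for odd k.
public class PriorBlocks {
    static int bySum(int k) {
        int total = 0;
        for (int j = 0; j < k; j++) total += 1 << (j / 2);  // blocks in SB_j
        return total;
    }

    static int byFormula(int k) {
        int even = 2 * ((1 << (k / 2)) - 1);    // even-k closed form
        if (k % 2 == 0) return even;
        return even + (1 << (k / 2));           // plus blocks of SB_[k-1]
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 10; k++)
            if (bySum(k) != byFormula(k)) throw new AssertionError("k=" + k);
        // prints 2: two blocks prior to SB_2, not the paper's p = 3
        System.out.println(bySum(2));
    }
}
```

In particular, `bySum(2)` is 2, matching Kevin's observation that only two data blocks precede SB_2.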
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz
>>>>>>>>>>>>>> >> >> >>> <martinrb at google.com>
>>>>>>>>>>>>>> >> >> >>> wrote:
>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>> >> >> >>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern
>>>>>>>>>>>>>> >> >> >>>> <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>> >> >> >>>> wrote:
>>>>>>>>>>>>>> >> >> >>>> > Hi Martin,
>>>>>>>>>>>>>> >> >> >>>> >
>>>>>>>>>>>>>> >> >> >>>> > Thanks much for your feedback. The first approach that comes to
>>>>>>>>>>>>>> >> >> >>>> > mind to implement O(1) time front as well as rear insertion is
>>>>>>>>>>>>>> >> >> >>>> > to create a cyclic list structure with a front/rear pointer - to
>>>>>>>>>>>>>> >> >> >>>> > insert at the front requires decrementing the front pointer
>>>>>>>>>>>>>> >> >> >>>> > (modulo the size) and to insert at the rear requires
>>>>>>>>>>>>>> >> >> >>>> > incrementing the rear pointer (modulo the size). We need to
>>>>>>>>>>>>>> >> >> >>>> > resize when the two pointers bump into each other. Could you
>>>>>>>>>>>>>> >> >> >>>> > explain more about your suggestion of introducing an arraylet
>>>>>>>>>>>>>> >> >> >>>> > that is shared by the front and the rear?
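The cyclic structure Kevin describes can be sketched as a single backing array with a head index that wraps modulo the capacity; this is illustrative only, not Kevin's implementation, and the class name `CyclicBuffer` is invented here.

```java
// Minimal cyclic list sketch: front insertion decrements the head index
// modulo the capacity, rear insertion writes past head + size modulo the
// capacity, and the array doubles when the two ends would collide.
public class CyclicBuffer<E> {
    private Object[] a = new Object[8];
    private int head = 0, size = 0;

    public void addFirst(E e) {
        ensureCapacity();
        head = (head - 1 + a.length) % a.length;  // decrement front, modulo capacity
        a[head] = e;
        size++;
    }

    public void addLast(E e) {
        ensureCapacity();
        a[(head + size) % a.length] = e;          // increment rear, modulo capacity
        size++;
    }

    @SuppressWarnings("unchecked")
    public E get(int i) { return (E) a[(head + i) % a.length]; }

    public int size() { return size; }

    private void ensureCapacity() {
        if (size < a.length) return;              // resize when front meets rear
        Object[] b = new Object[a.length * 2];
        for (int i = 0; i < size; i++) b[i] = a[(head + i) % a.length];
        a = b;
        head = 0;
    }
}
```

This is essentially the single-array layout ArrayDeque uses, which is what Martin alludes to in his reply.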
>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>> >> >> >>>> It was a half-baked idea - I don't know if there's a way to turn
>>>>>>>>>>>>>> >> >> >>>> it into something useful. I was thinking of the ArrayDeque
>>>>>>>>>>>>>> >> >> >>>> implementation, where all the elements live in a single array.
>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>> >> >> >>>> > It's not clear to me how that would help and/or be a better
>>>>>>>>>>>>>> >> >> >>>> > approach than the cyclic list. Anyhow, the paper that you
>>>>>>>>>>>>>> >> >> >>>> > reference, "Resizable arrays in optimal time and space", gives
>>>>>>>>>>>>>> >> >> >>>> > a deque so if we take that approach then the deque is specified.
>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>> >> >> >>>> Technically, ArrayList also supports the Deque
>>>>>>>>>>>>>> operations -
>>>>>>>>>>>>>> >> >> >>>> just not efficiently.
>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> >