A List implementation backed by multiple small arrays rather than the traditional single large array.
Kevin L. Stern
kevin.l.stern at gmail.com
Sat Apr 24 00:31:14 UTC 2010
Hi Benedict,
I took a look at your index decomposition routine; it was not working for
seed = 1 until I changed index to be the query index plus one (similar to my
r variable) and changed arrayIndex to ((firstArrayOfThisSize + arrayOffset) &
Integer.MAX_VALUE) - 1 (notice I'm subtracting one). Once I made these
changes, the routine was actually slower than my version (although not by
much). Let me know if you have a better way to bring your routine in line
with the arraylet structure.
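
For concreteness, here is roughly how I applied those two changes to your
snippet - an untested sketch, with seed and the backing structure as in your
code below:

    final int r = index + 1;  // the query index plus one
    final int arraySizeShiftMinusSeed =
        (31 - Integer.numberOfLeadingZeros(r >>> seed)) >>> 1;
    final int arraySizeShift = arraySizeShiftMinusSeed + seed;
    final int firstArrayOfThisSize =
        (1 << arraySizeShiftMinusSeed + 2)
        - (1 << arraySizeShiftMinusSeed)
        - 1 - (arraySizeShift >>> 31);
    final int indexRemainder = r - (1 << arraySizeShift << arraySizeShiftMinusSeed);
    final int arrayOffset = indexRemainder >>> arraySizeShift;
    final int arrayIndex = ((firstArrayOfThisSize + arrayOffset)
        & Integer.MAX_VALUE) - 1;  // note the extra subtraction of one
    final int itemIndex = index & ((1 << arraySizeShift) - 1);  // unchanged
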
Kevin
On Fri, Apr 23, 2010 at 2:26 PM, Benedict Elliott Smith <lists at laerad.com> wrote:
> Hi Kevin,
>
> Unfortunately this week has been pretty hectic, and I haven't had time to
> do much more than theorise on this topic - and this weekend the weather
> looks set to be much too nice to stay indoors! It looks like you've made
> really good progress on the ChunkedArrayDeque - I haven't fully digested it
> yet, but it looks pretty impressive. I'm not sure (since I misunderstood
> your list implementation prior to reading it in detail), but I think I may
> have been toying with a similar growth strategy for a hash map variant,
> since the copies are necessary there anyway.
>
> I have taken another look at the index algorithm and have tidied it up as
> below, and it now supports a seed size of 1; however, I am not sure that
> using a value this small is advisable, given that the overhead for the first
> array is at least 60 bytes; with a seed size of 1, the first 8 indexes would
> utilise less than 20% of the allocated memory for data, and the first 24
> less than 25%. I would have thought a seed of at least 2, and perhaps 3,
> would be advisable.
>
> As an aside, it is worth noting that the itemIndex calculation below can
> be replaced with index & (_backingArray[arrayIndex].length - 1), although I
> am not certain what effect this will have on performance, given that this
> would be another memory operation to retrieve the length from the array; but
> it would make the separation of the calculation into functions more
> straightforward (a sketch of that split follows the snippet below).
>
> final int arraySizeShiftMinusSeed =
>     (31 - Integer.numberOfLeadingZeros(index >>> seed)) >>> 1 ;
> final int arraySizeShift = arraySizeShiftMinusSeed + seed ;
>
> final int firstArrayOfThisSize =
>     (1 << arraySizeShiftMinusSeed + 2)
>     - (1 << arraySizeShiftMinusSeed)
>     - 1 - (arraySizeShift >>> 31) ;
> final int indexRemainder =
>     index - (1 << arraySizeShift << arraySizeShiftMinusSeed) ;
>
> final int arrayOffset = indexRemainder >>> arraySizeShift ;
> final int arrayIndex =
>     (firstArrayOfThisSize + arrayOffset) & Integer.MAX_VALUE ;
>
> final int itemIndex = index & ((1 << arraySizeShift) - 1) ;
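>
> (To make the function separation mentioned above concrete, the split might
> look something like this - method names purely illustrative, and untested:)
>
> private int arrayIndexFor(int index) {
>     final int arraySizeShiftMinusSeed =
>         (31 - Integer.numberOfLeadingZeros(index >>> seed)) >>> 1 ;
>     final int arraySizeShift = arraySizeShiftMinusSeed + seed ;
>     final int firstArrayOfThisSize =
>         (1 << arraySizeShiftMinusSeed + 2)
>         - (1 << arraySizeShiftMinusSeed)
>         - 1 - (arraySizeShift >>> 31) ;
>     final int indexRemainder =
>         index - (1 << arraySizeShift << arraySizeShiftMinusSeed) ;
>     return (firstArrayOfThisSize + (indexRemainder >>> arraySizeShift))
>         & Integer.MAX_VALUE ;
> }
>
> private int itemIndexFor(int arrayIndex, int index) {
>     // reads the array length rather than recomputing arraySizeShift
>     return index & (_backingArray[arrayIndex].length - 1) ;
> }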
>
>
>
> On 23 April 2010 11:00, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>
>> Hi Benedict,
>>
>> Have you had a chance to get your index decomposition procedure to work
>> with seed values less than two?
>>
>> Kevin
>>
>>
>> On Sat, Apr 17, 2010 at 11:48 AM, Benedict Elliott Smith <
>> lists at laerad.com> wrote:
>>
>>> Hi Kevin,
>>>
>>> As it happens I might have something useful still to contribute. As an
>>> exercise in saving face I revisited the problem to see if I could achieve
>>> the same complexity bounds as ChunkedArrayList but with a lower overhead. I
>>> must admit I still didn't fully appreciate how the algorithm in
>>> ChunkedArrayList worked until I tried to come up with an algorithm with
>>> similar properties. What I have ended up with is almost identical except
>>> adds I think a couple of incremental improvements, simply by redefining the
>>> arrayIndex() method. I should note that I have not yet implemented more than
>>> a prototype as it seems to me your implementation is excellent already, and
>>> if it is decided to include my modifications the changes should be modest.
>>>
>>> Firstly, (I hope that) what I have produced is a little more CPU
>>> pipe-line friendly; there is less dependency on immediately preceding
>>> calculations at each stage (i.e. so more operations should be able to
>>> proceed simultaneously in the pipeline), and consists exclusively of shifts,
>>> addition/subtraction and bit-wise (&)ands (except for the conditionals in
>>> Integer.numberOfLeadingZeros(i)), although the total number of instructions
>>> is approximately the same.
>>>
>>> Secondly, I have modified the algorithm so that a "seed" size can be
>>> specified (although I expect hard coding a suitable one will ultimately be
>>> best). Whereas ChunkedArrayList currently requires that the pattern of array
>>> allocation sizes be [1 x2, 2 x3, 4 x6, 8 x12, 16 x24, ...], we can now
>>> support, for some s, [s x2, 2s x3, 4s x6, 8s x12, 16s x24, ...], although
>>> when put in simple text like that it does appear to trivialise the change.
>>> The benefit of this, though, is two-fold: 1) for small n the constant factor
>>> is reduced (both CPU and memory wise); and 2) the sqrt(n) bounds are reached
>>> more quickly also.
>>>
>>> As an illustration, consider setting s to 4, and assume the backing
>>> array is size two and doubles in size with each growth; with
>>> ChunkedArrayList we would resize at i=2, i=6, i=20, i=72; with s as 4
>>> we would instead resize at i=8, i=24, i=80, i=288; the cost at each would be
>>> some multiple of 2, 4, 8, 16 respectively. As you can see, the latter is
>>> much closer to the sqrt(n) cost - both approach it eventually, but my
>>> suggestion is to reach it more quickly. This is at the expense of more
>>> slowly reaching the sqrt(n) wasted memory condition, but given the high
>>> constant factor cost wrt memory at this early stage, this seems a very
>>> sensible trade-off. It seems likely this should also have a positive impact
>>> on cache performance for smaller lists as well.
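>>>
>>> (As a quick sanity check on those numbers, a throwaway loop that walks the
>>> allocation pattern [s x2, 2s x3, 4s x6, 8s x12, ...] and prints the
>>> cumulative capacity each time the backing (index) array - starting at size
>>> two and doubling - fills; purely illustrative:)
>>>
>>> int s = 4 ;
>>> int blockSize = s, remainingOfThisSize = 2 ;
>>> int capacity = 0, arrays = 0, backing = 2 ;
>>> while (backing <= 16) {
>>>     capacity += blockSize ;
>>>     arrays++ ;
>>>     if (--remainingOfThisSize == 0) {
>>>         remainingOfThisSize = 3 * (blockSize / s) ;  // 3, 6, 12, ...
>>>         blockSize <<= 1 ;
>>>     }
>>>     if (arrays == backing) {
>>>         System.out.println("resize at i=" + capacity) ;  // 8, 24, 80, 288
>>>         backing <<= 1 ;
>>>     }
>>> }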
>>>
>>> Finally, after playing with this idea in my head I am confident I can
>>> extend the core ideas of this data structure to hashing relatively easily,
>>> getting the same worst case O(sqrt(n)) insertion cost and O(sqrt(n))
>>> wasted memory guarantees. I notice that this case hasn't been addressed yet,
>>> although I see from Martin's recent mail that this was raised before. Unless
>>> there are better suggestions for solving the hash table problem I will have
>>> a go at it as it seems an interesting problem - that is, assuming there are
>>> no objections?
>>>
>>> I'm interested to hear your thoughts. I hope this time I've been a bit
>>> more considered in what I've put forward, and hence less of a waste of time!
>>>
>>> Code snippet for calculation of array index and item offset:
>>>
>>> final int arraySizeShiftMinusSeed =
>>>     ((31 - Integer.numberOfLeadingZeros(index >>> seed)) >>> 1) ;
>>> final int arraySizeShift = arraySizeShiftMinusSeed + seed ;
>>> final int firstArrayOfThisSize =
>>>     ((((1 << arraySizeShiftMinusSeed + 3)
>>>     - (1 << arraySizeShiftMinusSeed + 1))) >>> 1) - 1 ;
>>> final int indexRemainder =
>>>     index - ((1 << seed) << arraySizeShiftMinusSeed + arraySizeShiftMinusSeed) ;
>>> final int arrayOffset = indexRemainder >>> arraySizeShift ;
>>>
>>> final int arrayIndex = firstArrayOfThisSize + arrayOffset ;
>>> final int itemIndex = index & ((1 << arraySizeShift) - 1) ;
>>>
>>> the first array size will be 1 << (seed - 1) (i.e. seed is equal to s +
>>> 1); seed only works for values of 2 or more at this moment, fyi
>>>
>>>
>>>
>>> On 16 April 2010 00:18, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>
>>>> Oh no worries Benedict, thanks for your interest in the topic. Let me
>>>> know if you have any other questions or if you have any related ideas or
>>>> concerns.
>>>>
>>>>
>>>> On Thu, Apr 15, 2010 at 8:00 AM, Benedict Elliott Smith <
>>>> lists at laerad.com> wrote:
>>>>
>>>>> Sorry Kevin - it sounds like I might be being more of a hindrance than a
>>>>> help. That part of the discussion was clearly truncated by the time I had
>>>>> joined the list - I haven't been able to find the history in the archives
>>>>> either...
>>>>>
>>>>> I was just wondering about the worst case cost of add() as described by
>>>>> your javadoc; admittedly it is optimal with respect to unused memory, but
>>>>> the worst case cost of an add is still sqrt(n), with a relatively high
>>>>> constant factor. I had been thinking that once n passed a threshold the cost
>>>>> of additions in this other structure would become a constant factor,
>>>>> offering nice algorithmic complexity guarantees for large n; however since
>>>>> sqrt(Integer.MAX_VALUE) is ~46,000, the maximum size of new array
>>>>> allocations would have to be unrealistically small (assuming linear cost for
>>>>> allocation) for this to be the case. It would still be nice to have a data
>>>>> structure that avoids needing to copy data with each grow, whilst still
>>>>> maintaining good memory performance.
>>>>>
>>>>> That *all* being said, I had been going by your javadoc and emails to
>>>>> ascertain the behaviour of this class, as I couldn't locate a free copy
>>>>> of [Brodnik99resizablearrays], and it seems this was a bad idea; as the
>>>>> sqrt(n) cost appears to be associated with growing the backing array, rather
>>>>> than with what I assumed to be copying data between arraylets, and it seems
>>>>> this cost is pretty optimal. That will teach me to post to a list without
>>>>> getting my facts straight first. The interesting thing is simply that the
>>>>> constant factor for this implementation still seems to be quite high,
>>>>> although perhaps that is simply because I was not benchmarking sufficiently
>>>>> large values of n.
>>>>>
>>>>>
>>>>>
>>>>> On 15 April 2010 12:12, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>>>
>>>>>> Hi Benedict,
>>>>>>
>>>>>> Unless I am misreading your post, this now has a very similar feel to
>>>>>> the first data structure that I posted to the list. Martin Buchholz then
>>>>>> pointed out that we can incorporate the ideas from
>>>>>> [Brodnik99resizablearrays] and reap additional benefits.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Kevin
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 15, 2010 at 4:07 AM, Benedict Elliott Smith <
>>>>>> lists at laerad.com> wrote:
>>>>>>
>>>>>>> Hi Kevin,
>>>>>>>
>>>>>>> Yes, as I was going to bed last night I realised I had not fully
>>>>>>> addressed the problem that was originally being visited; I had only
>>>>>>> reduced the constant factor for addition to the end of the list. A
>>>>>>> trivial modification fixes that, however: the same scheme, but up to some
>>>>>>> maximum arraylet size (a power of 2), after which the array is increased
>>>>>>> in size linearly. Performance doesn't seem to have been affected
>>>>>>> appreciably, although I have not been exhaustive in the benchmarking:
>>>>>>>
>>>>>>> 10 items inserts versus ArrayList: Chunked=1.15, ExpArray=1.16
>>>>>>> 10 items inserts Chunked / ExpArray = 0.99
>>>>>>> 10 items get versus ArrayList: Chunked=1.15, ExpArray=1.16
>>>>>>> 10 items get Chunked / ExpArray = 0.99
>>>>>>> 100 items inserts versus ArrayList: Chunked=1.24, ExpArray=1.01
>>>>>>> 100 items inserts Chunked / ExpArray = 1.23
>>>>>>> 100 items get versus ArrayList: Chunked=1.24, ExpArray=1.01
>>>>>>> 100 items get Chunked / ExpArray = 1.23
>>>>>>> 1000 items inserts versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>>> 1000 items inserts Chunked / ExpArray = 1.19
>>>>>>> 1000 items get versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>>> 1000 items get Chunked / ExpArray = 1.19
>>>>>>> 10000 items inserts versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>>> 10000 items inserts Chunked / ExpArray = 1.18
>>>>>>> 10000 items get versus ArrayList: Chunked=1.22, ExpArray=1.03
>>>>>>> 10000 items get Chunked / ExpArray = 1.18
>>>>>>> 100000 items inserts versus ArrayList: Chunked=0.82, ExpArray=0.75
>>>>>>> 100000 items inserts Chunked / ExpArray = 1.09
>>>>>>> 100000 items get versus ArrayList: Chunked=0.82, ExpArray=0.75
>>>>>>> 100000 items get Chunked / ExpArray = 1.09
>>>>>>>
>>>>>>> The nice thing about this is that the maximum amount of wasted memory
>>>>>>> is user configurable. Even with a low setting as above (65K) performance
>>>>>>> seems pretty consistent.
>>>>>>>
>>>>>>> Code for calculating the index and array offset is pretty
>>>>>>> straightforward; I haven't given much thought to optimisations just yet:
>>>>>>>
>>>>>>> private final int indexFor(int a, int i) {
>>>>>>>     return 1 + i - (a > maxArrayIndex
>>>>>>>         ? (1 + a - maxArrayIndex) << maxArraySizeShift
>>>>>>>         : 1 << a) ;
>>>>>>> }
>>>>>>> private final int arrayFor(int i) {
>>>>>>>     return i >= (maxArraySize << 1)
>>>>>>>         ? (i + 1 >>> maxArraySizeShift) + maxArrayIndex - 1
>>>>>>>         : 31 - Integer.numberOfLeadingZeros(i + 1) ;
>>>>>>> }
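>>>>>>>
>>>>>>> (So a get presumably decomposes along these lines - illustrative only,
>>>>>>> with 'arrays' standing in for the backing array-of-arrays:)
>>>>>>>
>>>>>>> public Object get(int i) {
>>>>>>>     final int a = arrayFor(i) ;
>>>>>>>     return arrays[a][indexFor(a, i)] ;
>>>>>>> }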
>>>>>>>
>>>>>>> Regarding the double list idea - yes, I agree, I certainly didn't
>>>>>>> think that one through fully!
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 15 April 2010 02:44, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Benedict,
>>>>>>>>
>>>>>>>> Like you, I am relatively new to this mailing list; I am also trying
>>>>>>>> to tread lightly so as not to step on any toes. That being said, I think
>>>>>>>> that I can offer a response to your inquiry.
>>>>>>>>
>>>>>>>> Regarding: "The idea is to simply double the new array size each
>>>>>>>> time a new array needs to be allocated"
>>>>>>>>
>>>>>>>> It seems this would not address the desire of offering an
>>>>>>>> alternative to the allocation of a large backing array for ArrayList (your
>>>>>>>> largest backing array could still reach a size of 1/2 * Integer.MAX_VALUE)
>>>>>>>> and would not address the desire of wasting the (asymptotically) minimum
>>>>>>>> amount of memory in the worst case while maintaining O(1) amortized time
>>>>>>>> bounds. The data structure described in [Brodnik99resizablearrays] has a
>>>>>>>> maximum backing array size of sqrt(n) and caps wasted memory at sqrt(n).
>>>>>>>> What advantage over ArrayList do you see in your data structure?
>>>>>>>>
>>>>>>>> Regarding: "Also, with regard to a Deque implementation, it seems
>>>>>>>> that the simplest solution would be to simply have two lists, with one
>>>>>>>> accepting inserts for near the beginning and being ordered in reverse whilst
>>>>>>>> the other accepted inserts for near to the end."
>>>>>>>>
>>>>>>>> What happens with your structure when you add n elements and then
>>>>>>>> remove element 0 n times? I think that once you work out all the kinks
>>>>>>>> you'll end up with the two stacks approach, which is mentioned in
>>>>>>>> [Brodnik99resizablearrays] and which I mentioned in an earlier email, or
>>>>>>>> you'll end up with the circular list approach, which is not friendly to O(1)
>>>>>>>> amortized time bounds in a data structure that resizes more often than O(n)
>>>>>>>> due to the 'unshift' to the front = 0 position. I think the best approach
>>>>>>>> is the one mentioned in [Brodnik99resizablearrays], which is the approach
>>>>>>>> that I am currently working on. Incidentally, this approach also provides
>>>>>>>> for a much improved index unpacking procedure using only bit shifts and bit
>>>>>>>> masks, although it is at the expense of (O(1)) additional work during
>>>>>>>> resize.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Kevin
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Apr 14, 2010 at 4:42 PM, Benedict Elliott Smith <
>>>>>>>> lists at laerad.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I hope you don't consider it rude to involve myself in this
>>>>>>>>> conversation towards the end - I joined the mailing list only recently.
>>>>>>>>>
>>>>>>>>> I'm not sure if this offers a huge amount to the discussion, but I
>>>>>>>>> have tinkered with a "chunked" array list which seems to offer better time
>>>>>>>>> performance in general at the cost of greater (worst case) memory
>>>>>>>>> utilisation. It is easier to understand IMHO as well, although this is not
>>>>>>>>> necessarily a great benefit here. It turns out the idea is very similar to
>>>>>>>>> the one implemented already by Kevin, though; but perhaps simpler. The idea
>>>>>>>>> is to simply double the new array size each time a new array needs to be
>>>>>>>>> allocated, or in effect allocate an array that is the size of all existing
>>>>>>>>> arrays put together. With this scheme the calculation for array and offset
>>>>>>>>> are really very straightforward: floor(log(i)) and 1 + i -
>>>>>>>>> 2^floor(log(i)), respectively. Memory utilisation is the same as for
>>>>>>>>> ArrayList, but obviously inserts at the end are much quicker.
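>>>>>>>>>
>>>>>>>>> (In code the decomposition is just the following - a sketch, writing
>>>>>>>>> the zero-based list index as i:)
>>>>>>>>>
>>>>>>>>> final int array = 31 - Integer.numberOfLeadingZeros(i + 1) ;  // floor(log2(i+1))
>>>>>>>>> final int offset = (i + 1) - (1 << array) ;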
>>>>>>>>>
>>>>>>>>> I have prototyped the data structure this evening and benchmarked
>>>>>>>>> additions at the end of the list, for which the performance is pretty
>>>>>>>>> impressive.
>>>>>>>>>
>>>>>>>>> Some random statistics for addition only on the client JVM (I have
>>>>>>>>> quickly dubbed my implementation ExpArrayList)
>>>>>>>>> All statistics were run in two rounds with ~1000 runs per round per
>>>>>>>>> statistic per list, and the second round results were used.
>>>>>>>>>
>>>>>>>>> 10 items versus ArrayList: Chunked=1.14, ExpArray=1.02
>>>>>>>>> 10 items Chunked / ExpArray = 1.12
>>>>>>>>> 100 items versus ArrayList: Chunked=1.20, ExpArray=0.82
>>>>>>>>> 100 items Chunked / ExpArray = 1.45
>>>>>>>>> 1000 items versus ArrayList: Chunked=1.03, ExpArray=0.51
>>>>>>>>> 1000 items Chunked / ExpArray = 2.02
>>>>>>>>> 10000 items versus ArrayList: Chunked=0.88, ExpArray=0.49
>>>>>>>>> 10000 items Chunked / ExpArray = 1.79
>>>>>>>>> 100000 items versus ArrayList: Chunked=0.32, ExpArray=0.20
>>>>>>>>> 100000 items Chunked / ExpArray = 1.64
>>>>>>>>>
>>>>>>>>> and server JVM:
>>>>>>>>> 10 items versus ArrayList: Chunked=1.00, ExpArray=1.16
>>>>>>>>> 10 items Chunked / ExpArray = 0.86
>>>>>>>>> 100 items versus ArrayList: Chunked=1.29, ExpArray=0.96
>>>>>>>>> 100 items Chunked / ExpArray = 1.34
>>>>>>>>> 1000 items versus ArrayList: Chunked=1.16, ExpArray=0.92
>>>>>>>>> 1000 items Chunked / ExpArray = 1.27
>>>>>>>>> 10000 items versus ArrayList: Chunked=0.93, ExpArray=0.84
>>>>>>>>> 10000 items Chunked / ExpArray = 1.12
>>>>>>>>> 100000 items versus ArrayList: Chunked=0.71, ExpArray=0.65
>>>>>>>>> 100000 items Chunked / ExpArray = 1.10
>>>>>>>>>
>>>>>>>>> Interestingly insertion at the beginning of the list appears to be
>>>>>>>>> quicker with ExpArrayList, at least on the server JVM, whereas I would have
>>>>>>>>> expected them to be fairly close.
>>>>>>>>> Amazingly ExpArrayList is faster even than ArrayList for insertion
>>>>>>>>> at the beginning of large lists, which I haven't yet tried to understand.
>>>>>>>>> Insertion in the middle is similar.
>>>>>>>>>
>>>>>>>>> 10 items versus ArrayList: Chunked=9.82, ExpArray=3.80
>>>>>>>>> 10 items Chunked / ExpArray = 2.59
>>>>>>>>> 100 items versus ArrayList: Chunked=7.30, ExpArray=3.41
>>>>>>>>> 100 items Chunked / ExpArray = 2.14
>>>>>>>>> 1000 items versus ArrayList: Chunked=2.83, ExpArray=1.09
>>>>>>>>> 1000 items Chunked / ExpArray = 2.59
>>>>>>>>> 10000 items versus ArrayList: Chunked=1.56, ExpArray=0.72
>>>>>>>>> 10000 items Chunked / ExpArray = 2.16
>>>>>>>>>
>>>>>>>>> Finally, there are promising results for get() from the
>>>>>>>>> ExpArrayList as well (server JVM), again somehow beating ArrayList for
>>>>>>>>> larger lists:
>>>>>>>>> 10 items get versus ArrayList: Chunked=1.27, ExpArray=1.16
>>>>>>>>> 10 items get Chunked / ExpArray = 1.10
>>>>>>>>> 100 items get versus ArrayList: Chunked=1.45, ExpArray=1.17
>>>>>>>>> 100 items get Chunked / ExpArray = 1.25
>>>>>>>>> 1000 items get versus ArrayList: Chunked=1.42, ExpArray=1.07
>>>>>>>>> 1000 items get Chunked / ExpArray = 1.33
>>>>>>>>> 10000 items get versus ArrayList: Chunked=1.26, ExpArray=1.02
>>>>>>>>> 10000 items get Chunked / ExpArray = 1.24
>>>>>>>>> 100000 items get versus ArrayList: Chunked=1.05, ExpArray=0.86
>>>>>>>>> 100000 items get Chunked / ExpArray = 1.22
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm willing to explore this further but I'm not sure how desirable
>>>>>>>>> that is, given that Kevin's data structure appears to perform pretty well
>>>>>>>>> already wrt CPU time, and better wrt memory utilisation, and in effect
>>>>>>>>> this mostly changes only the function to determine which array to use, not
>>>>>>>>> the body of the implementation. Let me know if you would like a copy of the
>>>>>>>>> source code and I will find somewhere to upload it.
>>>>>>>>>
>>>>>>>>> Also, with regard to a Deque implementation, it seems that the
>>>>>>>>> simplest solution would be to simply have two lists, with one accepting
>>>>>>>>> inserts for near the beginning and being ordered in reverse whilst the other
>>>>>>>>> accepted inserts for near to the end. The only trick would be having the
>>>>>>>>> list at the beginning support iteration in reverse order cheaply, but this
>>>>>>>>> could easily be achieved by creating an extension of List with a
>>>>>>>>> reverseIterator() method.
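>>>>>>>>>
>>>>>>>>> (i.e. something as simple as the following - illustrative only:)
>>>>>>>>>
>>>>>>>>> public interface ReversibleList<E> extends List<E> {
>>>>>>>>>     Iterator<E> reverseIterator() ;  // cheap reverse-order iteration
>>>>>>>>> }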
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Anyway, not sure if this helped at all but fancied joining in...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 14 April 2010 12:25, Joe Kearney <joe.j.kearney at googlemail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Kevin,
>>>>>>>>>>
>>>>>>>>>> It implements List, as well as Deque. It is indeed based on
>>>>>>>>>> ArrayDeque, with the added operations to implement list. It does so
>>>>>>>>>> reasonably efficiently, moving the fewest elements possible on each
>>>>>>>>>> operation, that is zero for the queue operations, at most n/2 for the rest
>>>>>>>>>> and all of them for a backing array resize.
>>>>>>>>>>
>>>>>>>>>> The idea is to get a replacement for ArrayList that performs like
>>>>>>>>>> ArrayDeque on remove(0). As a side effect, we should be able to get better
>>>>>>>>>> performance on other operations by requiring fewer elements to be moved.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Joe
>>>>>>>>>>
>>>>>>>>>> 2010/4/14 Kevin L. Stern <kevin.l.stern at gmail.com>
>>>>>>>>>>
>>>>>>>>>> Hi Joe,
>>>>>>>>>>>
>>>>>>>>>>> I was referring to the ChunkedArrayList when I stated that add
>>>>>>>>>>> does not amortize to constant time when the data structure employs the
>>>>>>>>>>> circular list trick to achieve deque behavior; ChunkedArrayList potentially
>>>>>>>>>>> resizes every n^(1/2) operations.
>>>>>>>>>>>
>>>>>>>>>>> Regarding your CircularArrayList, does it differ from Java's
>>>>>>>>>>> ArrayDeque? I took only a cursory look at it, so please understand if I
>>>>>>>>>>> have missed your reason for creating CircularArrayList altogether.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>>> Kevin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Apr 13, 2010 at 6:52 AM, Joe Kearney <
>>>>>>>>>>> joe.j.kearney at googlemail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Kevin, Martin,
>>>>>>>>>>>>
>>>>>>>>>>>> To add another discussion point, I've been writing a
>>>>>>>>>>>> draft/proof-of-concept of retrofitting the List interface onto ArrayDeque.
>>>>>>>>>>>> This works over the raw array, it doesn't use the fancier structures being
>>>>>>>>>>>> discussed elsewhere on this list that deal with splitting huge arrays into
>>>>>>>>>>>> arraylets, or that provide for O(1) insert in the middle.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> http://code.google.com/p/libjoe/source/browse/trunk/src/joe/collect/CircularArrayList.java
>>>>>>>>>>>>
>>>>>>>>>>>> I'd be interested if you have any comments in the context of
>>>>>>>>>>>> this discussion. The code is not entirely ready yet, a couple of tests fail
>>>>>>>>>>>> (6/789) because of a corner case I haven't nailed yet, but the idea is there
>>>>>>>>>>>> at least. I'd like to add array shrinking later, when the size dips below
>>>>>>>>>>>> capacity*0.4 perhaps, to avoid flickering up and down around...
>>>>>>>>>>>>
>>>>>>>>>>>> Tests show performance to be close to ArrayList for the O(1)
>>>>>>>>>>>> operations. Timings for indexed reads and writes showed
>>>>>>>>>>>> no discernible difference between implementations last time I ran the
>>>>>>>>>>>> tests. I don't understand at the moment why the iterator add at index
>>>>>>>>>>>> size/3, size/2 perform 30% slower than ArrayList on smaller lists, nor the
>>>>>>>>>>>> dodgy numbers for ArrayList.insert(5), I'll look at this soon. Those
>>>>>>>>>>>> operations that become O(1) in a circular implementation (that are
>>>>>>>>>>>> implemented and tested here) are faster than in ArrayList. Insert/remove in
>>>>>>>>>>>> the middle are somewhat faster than ArrayList because we only have to copy
>>>>>>>>>>>> at most half of the elements, except when resizing the array.
>>>>>>>>>>>>
>>>>>>>>>>>> Kevin, I don't fully understand your point about not amortizing
>>>>>>>>>>>> to O(1). Certainly that's true for insert not at head or tail. Otherwise
>>>>>>>>>>>> this implementation only moves array elements to the front on an array
>>>>>>>>>>>> resize operation which happens every O(ln n) operations at most, if we do
>>>>>>>>>>>> lots of adds, maybe a little more if we add array shrinking too. This is
>>>>>>>>>>>> the same as ArrayList. Are you just referring to the add-in-the-middle case?
>>>>>>>>>>>>
>>>>>>>>>>>> Some performance results below, code for these is in the
>>>>>>>>>>>> repository above too. This was the second run, after a warmup.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Joe
>>>>>>>>>>>>
>>>>>>>>>>>> ------------------------- CircularArrayList -------------------------
>>>>>>>>>>>>  size   add  get  set  iterAdd/3  iterAdd/2  insert(5)  removeRnd  removeMid  remove(0)
>>>>>>>>>>>>    10    20   67   70        125        102         90        240        191        138
>>>>>>>>>>>>   100    19   67   70        166        138         94        230        194        118
>>>>>>>>>>>>  1000    28   64   67        681        538         91        324        382        119
>>>>>>>>>>>> 10000    30   65   67       5884       4425         94       1296       2330        124
>>>>>>>>>>>>
>>>>>>>>>>>> ----------------------------- ArrayList ------------------------------
>>>>>>>>>>>>  size   add  get  set  iterAdd/3  iterAdd/2  insert(5)  removeRnd  removeMid  remove(0)
>>>>>>>>>>>>    10    23   68   70        100         69      32913        162        130        105
>>>>>>>>>>>>   100    20   67   70        129        104      21944        169        134        135
>>>>>>>>>>>>  1000    29   63   67        651        506       9602        364        333        526
>>>>>>>>>>>> 10000    30   63   66       5878       4414       9947       2312       2280       4437
>>>>>>>>>>>>
>>>>>>>>>>>> 2010/4/13 Kevin L. Stern <kevin.l.stern at gmail.com>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I had intended to address your request for absolute O(1)
>>>>>>>>>>>>> operations in the previous email. The approach to achieving this suggested
>>>>>>>>>>>>> in [Brodnik99resizablearrays] is tantamount to making ArrayList operations
>>>>>>>>>>>>> absolute O(1) by keeping around an array of size (3/2)*n and filling it with
>>>>>>>>>>>>> a constant number of entries from the main array each time add is called.
>>>>>>>>>>>>> Although this distributes the work done during a resize across the n
>>>>>>>>>>>>> operations required to enter a resize-required state, it is at the expense
>>>>>>>>>>>>> of additional memory usage and slower add operations. My thought is that
>>>>>>>>>>>>> this would be a fine approach for a real-time application that requires hard
>>>>>>>>>>>>> guarantees on performance but would be a liability in so many Java
>>>>>>>>>>>>> applications that do not require these hard guarantees. I look forward to
>>>>>>>>>>>>> hearing your thoughts on the matter, though.
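>>>>>>>>>>>>>
>>>>>>>>>>>>> (For illustration only, a rough sketch of that scheme - the field
>>>>>>>>>>>>> names and the per-add copy constant are mine, not from the paper,
>>>>>>>>>>>>> and the corner cases are untested:)
>>>>>>>>>>>>>
>>>>>>>>>>>>> private Object[] main = new Object[4], shadow ;
>>>>>>>>>>>>> private int size, migrated ;
>>>>>>>>>>>>>
>>>>>>>>>>>>> public void add(Object e) {
>>>>>>>>>>>>>     if (shadow == null && size == main.length) {
>>>>>>>>>>>>>         shadow = new Object[main.length * 3 / 2] ;  // the (3/2)*n array
>>>>>>>>>>>>>         migrated = 0 ;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     if (shadow != null) {
>>>>>>>>>>>>>         // constant copy work per add, spreading the resize over
>>>>>>>>>>>>>         // the operations leading up to the next resize
>>>>>>>>>>>>>         for (int c = 0; c < 3 && migrated < main.length; c++, migrated++)
>>>>>>>>>>>>>             shadow[migrated] = main[migrated] ;
>>>>>>>>>>>>>         shadow[size++] = e ;  // new elements go straight into the new array
>>>>>>>>>>>>>         if (migrated == main.length) { main = shadow ; shadow = null ; }
>>>>>>>>>>>>>     } else {
>>>>>>>>>>>>>         main[size++] = e ;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> public Object get(int i) {
>>>>>>>>>>>>>     // copied prefixes are identical, so only indices beyond the old
>>>>>>>>>>>>>     // capacity must be read from the new array
>>>>>>>>>>>>>     return shadow != null && i >= main.length ? shadow[i] : main[i] ;
>>>>>>>>>>>>> }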
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Apr 13, 2010 at 6:18 AM, Kevin L. Stern <
>>>>>>>>>>>>> kevin.l.stern at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's interesting to note that the old circular list trick will
>>>>>>>>>>>>>> not suffice to turn this data structure into a deque since we might be
>>>>>>>>>>>>>> copying all n elements back to the front = 0 position every n^(1/2)
>>>>>>>>>>>>>> operations (add wouldn't amortize to O(1)). We could use the old two stacks
>>>>>>>>>>>>>> trick (push elements onto one stack, flip the bottom half of the
>>>>>>>>>>>>>> elements to the 'other' stack when the 'other' stack becomes empty),
>>>>>>>>>>>>>> mentioned in [Brodnik99resizablearrays], but I find this to be a bit CS
>>>>>>>>>>>>>> 101. In [Brodnik99resizablearrays] the authors suggest a method for making
>>>>>>>>>>>>>> all blocks roughly the same size, allowing us to expand/shrink capacity at
>>>>>>>>>>>>>> the beginning or the end; this is the approach that I will take to create a
>>>>>>>>>>>>>> deque.
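>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (For reference, the two stacks trick in rough form - a sketch
>>>>>>>>>>>>>> showing only one side; the other is symmetric, and there is no
>>>>>>>>>>>>>> emptiness check:)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> private final ArrayList<Object> front = new ArrayList<Object>() ;
>>>>>>>>>>>>>> private final ArrayList<Object> back = new ArrayList<Object>() ;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> public void addFirst(Object e) { front.add(e) ; }
>>>>>>>>>>>>>> public void addLast(Object e) { back.add(e) ; }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> public Object removeFirst() {
>>>>>>>>>>>>>>     if (front.isEmpty()) {
>>>>>>>>>>>>>>         // flip the bottom half of 'back' onto 'front', reversing order
>>>>>>>>>>>>>>         int half = (back.size() + 1) / 2 ;
>>>>>>>>>>>>>>         for (int i = half - 1; i >= 0; i--)
>>>>>>>>>>>>>>             front.add(back.get(i)) ;
>>>>>>>>>>>>>>         back.subList(0, half).clear() ;
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     return front.remove(front.size() - 1) ;
>>>>>>>>>>>>>> }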
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The FAQ for the Sun Contributor Agreement Q3 (
>>>>>>>>>>>>>> http://www.sun.com/software/opensource/contributor_agreement.jsp#sa_3)
>>>>>>>>>>>>>> indicates that one should check with the project to determine where the SCA
>>>>>>>>>>>>>> should be sent. Do you know where I would find this information?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> @MISC{Brodnik99resizablearrays,
>>>>>>>>>>>>>>   author = {Andrej Brodnik and Svante Carlsson and Erik D. Demaine
>>>>>>>>>>>>>>     and J. Ian Munro and Robert Sedgewick},
>>>>>>>>>>>>>>   title = {Resizable Arrays in Optimal Time and Space},
>>>>>>>>>>>>>>   year = {1999}
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Apr 11, 2010 at 4:17 PM, Martin Buchholz <
>>>>>>>>>>>>>> martinrb at google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Kevin,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for your continuing work on this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I like the test results, and agree with your analysis.
>>>>>>>>>>>>>>> I'm especially happy that you're beating
>>>>>>>>>>>>>>> ArrayList at some operations.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'd like to see O(1) addition at the beginning,
>>>>>>>>>>>>>>> and to implement both List and Deque (I regret
>>>>>>>>>>>>>>> our not having done this with ArrayDeque).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> An additional property that would be nice to
>>>>>>>>>>>>>>> have (but don't try too hard)
>>>>>>>>>>>>>>> is to provide some kind of real-time
>>>>>>>>>>>>>>> guarantees on the cost of an individual operation,
>>>>>>>>>>>>>>> not just amortized time. E.g. ArrayList.add
>>>>>>>>>>>>>>> is worst-case O(n), making it unsuitable for use
>>>>>>>>>>>>>>> in some real-time applications.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will help get your changes into the obvious
>>>>>>>>>>>>>>> software distributions. I assume you're happy
>>>>>>>>>>>>>>> with having this class included in any of
>>>>>>>>>>>>>>> Doug Lea's jsr166, guava-libraries, or the JDK itself.
>>>>>>>>>>>>>>> You should sign a Sun contributor agreement,
>>>>>>>>>>>>>>> or whatever the Oracle equivalent is,
>>>>>>>>>>>>>>> if you have not done so yet.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Doug Lea likes public domain,
>>>>>>>>>>>>>>> guava-libraries likes the Apache license.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We should give various people a chance to give
>>>>>>>>>>>>>>> a thumbs up on the design of this class -
>>>>>>>>>>>>>>> Doug Lea, Josh Bloch.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Apr 11, 2010 at 09:32, Kevin L. Stern <
>>>>>>>>>>>>>>> kevin.l.stern at gmail.com> wrote:
>>>>>>>>>>>>>>> > Hello Martin,
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > I spent some time this weekend trying to bring out bugs in
>>>>>>>>>>>>>>> > the implementation; I believe the latest version to be in
>>>>>>>>>>>>>>> > decent shape. I have also gathered some data on the
>>>>>>>>>>>>>>> > performance of ChunkedArrayList over ArrayList using the
>>>>>>>>>>>>>>> > latest 1.6 JDK, which I've included below (note that the
>>>>>>>>>>>>>>> > numbers represent the time spent performing the specified
>>>>>>>>>>>>>>> > operation with ChunkedArrayList over the time spent with
>>>>>>>>>>>>>>> > ArrayList, so 1.00 indicates equivalent performance, < 1.00
>>>>>>>>>>>>>>> > indicates that ChunkedArrayList is less costly and > 1.00
>>>>>>>>>>>>>>> > indicates that ArrayList is less costly). I've noticed
>>>>>>>>>>>>>>> > relatively significant variability in a few of the numbers
>>>>>>>>>>>>>>> > when I switch hardware; though, these data do seem to
>>>>>>>>>>>>>>> > represent rough performance expectations. For my test I
>>>>>>>>>>>>>>> > generated x elements and then timed the process of adding
>>>>>>>>>>>>>>> > them to ArrayList/ChunkedArrayList, then I performed a get
>>>>>>>>>>>>>>> > operation on each for indices 0 through x-1 and finally I
>>>>>>>>>>>>>>> > used the iterator mechanism to retrieve the first through
>>>>>>>>>>>>>>> > xth element (of course, I performed each of these
>>>>>>>>>>>>>>> > operations multiple times throwing away the timing for the
>>>>>>>>>>>>>>> > first few iterations to warm up the JVM).
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Regarding the question of whether or not this belongs in
>>>>>>>>>>>>>>> > java.util, I would suggest that if it is desirable from a
>>>>>>>>>>>>>>> > GC point of view to eliminate the large backing array from
>>>>>>>>>>>>>>> > ArrayList then your suggestion of achieving this by way of
>>>>>>>>>>>>>>> > a data structure that is both time and space optimal is a
>>>>>>>>>>>>>>> > particularly elegant solution as it not only guarantees
>>>>>>>>>>>>>>> > that no backing array will be larger than sqrt(n) elements
>>>>>>>>>>>>>>> > but it also provides dynamic shrinking behavior, has less
>>>>>>>>>>>>>>> > maximum memory overhead than ArrayList, and copies
>>>>>>>>>>>>>>> > (asymptotically) fewer elements during a resize than
>>>>>>>>>>>>>>> > ArrayList. Of course, this data structure does not do
>>>>>>>>>>>>>>> > everything better than ArrayList; in particular, indexed
>>>>>>>>>>>>>>> > access is more costly, due to the required decomposition of
>>>>>>>>>>>>>>> > the index into backing array index and offset and the
>>>>>>>>>>>>>>> > additional memory indirection, and insertion-at-an-index is
>>>>>>>>>>>>>>> > more costly, due to the multiple array copies necessary to
>>>>>>>>>>>>>>> > complete the shift. That being said, I think that the
>>>>>>>>>>>>>>> > additional cost of indexed access is partially mitigated by
>>>>>>>>>>>>>>> > the availability of iterator and listIterator, whose
>>>>>>>>>>>>>>> > implementations do not use the index decomposition
>>>>>>>>>>>>>>> > procedure, and the additional cost of insertion-at-an-index
>>>>>>>>>>>>>>> > is partially mitigated by the fact that
>>>>>>>>>>>>>>> > insertion-at-an-index is already an undesirable operation
>>>>>>>>>>>>>>> > on ArrayList due to its linear time complexity.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Kevin
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > 1000000 elements:
>>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 1.30
>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.80
>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.52
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.81
>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.87
>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 1.31
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > 100000 elements:
>>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.96
>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.86
>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.48
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.96
>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.89
>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 2.68
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > 10000 elements:
>>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 1.04
>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.33
>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.53
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.97
>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.45
>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 2.52
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > 1000 elements:
>>>>>>>>>>>>>>> > Client JVM:
>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.99
>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 2.27
>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 0.54
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Server JVM:
>>>>>>>>>>>>>>> > Add to ChunkedArrayList over ArrayList: 0.84
>>>>>>>>>>>>>>> > Indexed access ChunkedArrayList over ArrayList: 1.23
>>>>>>>>>>>>>>> > Iterator ChunkedArrayList over ArrayList: 1.11
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > On Fri, Apr 9, 2010 at 7:42 PM, Martin Buchholz <
>>>>>>>>>>>>>>> martinrb at google.com> wrote:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> My feeling on whether to support O(1) at both ends is
>>>>>>>>>>>>>>> >> that any flavor of this that ends up in the JDK eventually
>>>>>>>>>>>>>>> >> should really do this. My idea is that we can
>>>>>>>>>>>>>>> >> wholeheartedly recommend this collection class
>>>>>>>>>>>>>>> >> for overall good behavior without any of the surprising
>>>>>>>>>>>>>>> >> performance traps of existing collection classes.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> But for the preliminary version, it makes sense to
>>>>>>>>>>>>>>> >> support only O(1) at one end, if it simplifies the
>>>>>>>>>>>>>>> >> implementation. Random access will of course
>>>>>>>>>>>>>>> >> be worse than ArrayList, but by how much?
>>>>>>>>>>>>>>> >> We can do some benchmarking and look for
>>>>>>>>>>>>>>> >> micro-optimizations now.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Kevin, what is your own personal feeling?
>>>>>>>>>>>>>>> >> Is the algorithm correct, and efficient enough?
>>>>>>>>>>>>>>> >> Do you think your new collection belongs in java.util?
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Martin
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> On Sun, Apr 4, 2010 at 04:12, Kevin L. Stern <
>>>>>>>>>>>>>>> kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>> >> wrote:
>>>>>>>>>>>>>>> >> > The data structure is available at the second link that
>>>>>>>>>>>>>>> >> > I originally provided (once again, it is
>>>>>>>>>>>>>>> >> > https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en ).
>>>>>>>>>>>>>>> >> > This does not have O(1) time insertion at the front as
>>>>>>>>>>>>>>> >> > yet, as it was unclear to me whether or not it was
>>>>>>>>>>>>>>> >> > agreed upon:
>>>>>>>>>>>>>>> >> > _________________
>>>>>>>>>>>>>>> >> > From: Osvaldo Doederlein <opinali at gmail.com>
>>>>>>>>>>>>>>> >> > Date: Mon, Mar 29, 2010 at 10:08 AM
>>>>>>>>>>>>>>> >> > Subject: Re: A List implementation backed by multiple
>>>>>>>>>>>>>>> small arrays
>>>>>>>>>>>>>>> >> > rather
>>>>>>>>>>>>>>> >> > than the traditional single large array.
>>>>>>>>>>>>>>> >> > To: Martin Buchholz <martinrb at google.com>
>>>>>>>>>>>>>>> >> > Cc: "Kevin L. Stern" <kevin.l.stern at gmail.com>,
>>>>>>>>>>>>>>> >> > core-libs-dev at openjdk.java.net
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Initially, it would be good enough to replace only
>>>>>>>>>>>>>>> >> > java.util.ArrayList with minimal overhead. ArrayList
>>>>>>>>>>>>>>> >> > does not support efficient add-at-front or other
>>>>>>>>>>>>>>> >> > enhancements of ArrayDeque; but ArrayList is still a
>>>>>>>>>>>>>>> >> > much more important and popular collection, it's the
>>>>>>>>>>>>>>> >> > primary "straight replacement for primitive arrays" and
>>>>>>>>>>>>>>> >> > I guess it should continue with that role.
>>>>>>>>>>>>>>> >> > _________________
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > As a disclaimer, I'm still tinkering with this so I'll
>>>>>>>>>>>>>>> >> > be updating the document at the provided link as I find
>>>>>>>>>>>>>>> >> > improvements.
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Thoughts?
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Thanks,
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Kevin
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > On Thu, Apr 1, 2010 at 10:28 PM, Martin Buchholz <
>>>>>>>>>>>>>>> martinrb at google.com>
>>>>>>>>>>>>>>> >> > wrote:
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> Hi Kevin,
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> You're probably the only one on this list who has
>>>>>>>>>>>>>>> >> >> seriously read the paper. It is not surprising that
>>>>>>>>>>>>>>> >> >> taking a research paper into production would
>>>>>>>>>>>>>>> >> >> discover bugs - the research never had to undergo
>>>>>>>>>>>>>>> >> >> rigorous testing. (I like the Java culture of
>>>>>>>>>>>>>>> >> >> combining spec + implementation + test suite)
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> I suggest you ask the authors directly about the bug.
>>>>>>>>>>>>>>> >> >> They would probably also be interested to hear
>>>>>>>>>>>>>>> >> >> about your implementation.
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> Are you aware of Integer.numberOfLeadingZeros?
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> Martin
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern <
>>>>>>>>>>>>>>> kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>> >> >> wrote:
>>>>>>>>>>>>>>> >> >> > I'm almost convinced now that the paper is incorrect.
>>>>>>>>>>>>>>> >> >> > The code below gives me the appropriate index into
>>>>>>>>>>>>>>> >> >> > the index array and the offset into the data block.
>>>>>>>>>>>>>>> >> >> > That being said, remember when I mentioned that this
>>>>>>>>>>>>>>> >> >> > will include a bit more work to access an element
>>>>>>>>>>>>>>> >> >> > than a simple bit shift and a bit mask? Well this is
>>>>>>>>>>>>>>> >> >> > more than a bit more - we'll be doing this each time
>>>>>>>>>>>>>>> >> >> > an index is requested. I'll spend some time trying to
>>>>>>>>>>>>>>> >> >> > twiddle the bits to see if I can eliminate/combine
>>>>>>>>>>>>>>> >> >> > some of the operations.
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> > for (int r = 1; r < 33; r++) {
>>>>>>>>>>>>>>> >> >> >     int k = lg(r) ;                      // k = |r| - 1 = floor(log2(r))
>>>>>>>>>>>>>>> >> >> >     int floorKO2 = k >> 1 ;              // floor(k/2)
>>>>>>>>>>>>>>> >> >> >     int powFloorKO2 = (1 << floorKO2) ;
>>>>>>>>>>>>>>> >> >> >     int p = ((1 << floorKO2) - 1) << 1 ; // blocks prior to SB_k
>>>>>>>>>>>>>>> >> >> >     int ceilKO2 ;
>>>>>>>>>>>>>>> >> >> >     if ((k & 1) == 1) {
>>>>>>>>>>>>>>> >> >> >         ceilKO2 = floorKO2 + 1 ;
>>>>>>>>>>>>>>> >> >> >         p += powFloorKO2 ;
>>>>>>>>>>>>>>> >> >> >     } else {
>>>>>>>>>>>>>>> >> >> >         ceilKO2 = floorKO2 ;
>>>>>>>>>>>>>>> >> >> >     }
>>>>>>>>>>>>>>> >> >> >     int e = r & ((1 << ceilKO2) - 1) ;   // offset within the block
>>>>>>>>>>>>>>> >> >> >     int b = (r >> ceilKO2) & (powFloorKO2 - 1) ;  // block within SB_k
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> >     System.out.println((r - 1) + " " + (p + b) + " " + e) ;
>>>>>>>>>>>>>>> >> >> > }
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> > Kevin
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> > On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern
>>>>>>>>>>>>>>> >> >> > <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>> >> >> > wrote:
>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>> >> >> >> I realize that 2 * (2^(k/2) - 1) only works for even
>>>>>>>>>>>>>>> >> >> >> numbered superblocks; the odd numbered superblocks
>>>>>>>>>>>>>>> >> >> >> need an additional term added (the number of data
>>>>>>>>>>>>>>> >> >> >> blocks in SB_[k-1]) to jibe with my interpretation;
>>>>>>>>>>>>>>> >> >> >> anyhow, I also came across an alternative
>>>>>>>>>>>>>>> >> >> >> characterization of superblock in the paper which
>>>>>>>>>>>>>>> >> >> >> states that data blocks are grouped within a
>>>>>>>>>>>>>>> >> >> >> superblock when they are the same size - to me,
>>>>>>>>>>>>>>> >> >> >> though, that implies that my example structure below
>>>>>>>>>>>>>>> >> >> >> would be
>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>> >> >> >> SB_0: [1]
>>>>>>>>>>>>>>> >> >> >> SB_1: [2][2][2]
>>>>>>>>>>>>>>> >> >> >> SB_2: [4][4][4][4][4][4]
>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>> >> >> >> which seems to contradict my understanding of (1)
>>>>>>>>>>>>>>> >> >> >> below. I must be reading this upside down.
>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>> >> >> >> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern
>>>>>>>>>>>>>>> >> >> >> <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>> >> >> >> wrote:
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> What am I missing here? In "Resizable arrays in
>>>>>>>>>>>>>>> >> >> >>> optimal time and space" the authors define their
>>>>>>>>>>>>>>> >> >> >>> data structure with the following property:
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> (1) "When superblock SB_k is fully allocated, it
>>>>>>>>>>>>>>> >> >> >>> consists of 2^(floor(k/2)) data blocks, each of
>>>>>>>>>>>>>>> >> >> >>> size 2^(ceil(k/2))."
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> Since the superblock is zero-based indexed this
>>>>>>>>>>>>>>> >> >> >>> implies the following structure:
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> SB_0: [1]
>>>>>>>>>>>>>>> >> >> >>> SB_1: [2]
>>>>>>>>>>>>>>> >> >> >>> SB_2: [2][2]
>>>>>>>>>>>>>>> >> >> >>> SB_3: [4][4]
>>>>>>>>>>>>>>> >> >> >>> SB_4: [4][4][4][4]
>>>>>>>>>>>>>>> >> >> >>> [...]
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> Let's have a look at Algorithm 3, Locate(i), with
>>>>>>>>>>>>>>> >> >> >>> i = 3:
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> r = 100 (the binary expansion of i + 1)
>>>>>>>>>>>>>>> >> >> >>> k = |r| - 1 = 2
>>>>>>>>>>>>>>> >> >> >>> p = 2^k - 1 = 3
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> What concerns me is their statement that p
>>>>>>>>>>>>>>> >> >> >>> represents "the number of data blocks in
>>>>>>>>>>>>>>> >> >> >>> superblocks prior to SB_k." There are only two data
>>>>>>>>>>>>>>> >> >> >>> blocks in superblocks prior to SB_2, not three.
>>>>>>>>>>>>>>> >> >> >>> Given (1) above, unless I'm misinterpreting it, the
>>>>>>>>>>>>>>> >> >> >>> number of data blocks in superblocks prior to SB_k
>>>>>>>>>>>>>>> >> >> >>> should be:
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1)
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> This, of course, seems to work out much better in
>>>>>>>>>>>>>>> >> >> >>> my example above (for k = 2 it gives 2 * (2^1 - 1)
>>>>>>>>>>>>>>> >> >> >>> = 2), giving the correct answer to my
>>>>>>>>>>>>>>> >> >> >>> interpretation of their data structure, but I have
>>>>>>>>>>>>>>> >> >> >>> a hard time believing that this is their mistake
>>>>>>>>>>>>>>> >> >> >>> rather than my misinterpretation.
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> Thoughts?
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> Kevin
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz
>>>>>>>>>>>>>>> >> >> >>> <martinrb at google.com>
>>>>>>>>>>>>>>> >> >> >>> wrote:
>>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>>> >> >> >>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern
>>>>>>>>>>>>>>> >> >> >>>> <kevin.l.stern at gmail.com>
>>>>>>>>>>>>>>> >> >> >>>> wrote:
>>>>>>>>>>>>>>> >> >> >>>> > Hi Martin,
>>>>>>>>>>>>>>> >> >> >>>> >
>>>>>>>>>>>>>>> >> >> >>>> > Thanks much for your feedback. The first
>>>>>>>>>>>>>>> >> >> >>>> > approach that comes to mind to implement O(1)
>>>>>>>>>>>>>>> >> >> >>>> > time front as well as rear insertion is to
>>>>>>>>>>>>>>> >> >> >>>> > create a cyclic list structure with a front/rear
>>>>>>>>>>>>>>> >> >> >>>> > pointer - to insert at the front requires
>>>>>>>>>>>>>>> >> >> >>>> > decrementing the front pointer (modulo the size)
>>>>>>>>>>>>>>> >> >> >>>> > and to insert at the rear requires incrementing
>>>>>>>>>>>>>>> >> >> >>>> > the rear pointer (modulo the size). We need to
>>>>>>>>>>>>>>> >> >> >>>> > resize when the two pointers bump into each
>>>>>>>>>>>>>>> >> >> >>>> > other. Could you explain more about your
>>>>>>>>>>>>>>> >> >> >>>> > suggestion of introducing an arraylet that is
>>>>>>>>>>>>>>> >> >> >>>> > shared by the front and the rear?
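>>>>>>>>>>>>>>> >> >> >>>> >
>>>>>>>>>>>>>>> >> >> >>>> > (For reference, the decrement/increment-modulo
>>>>>>>>>>>>>>> >> >> >>>> > scheme I have in mind, assuming a power-of-two
>>>>>>>>>>>>>>> >> >> >>>> > capacity, with doubleCapacity() standing in for
>>>>>>>>>>>>>>> >> >> >>>> > the resize - illustrative only:)
>>>>>>>>>>>>>>> >> >> >>>> >
>>>>>>>>>>>>>>> >> >> >>>> > void addFirst(Object e) {
>>>>>>>>>>>>>>> >> >> >>>> >     front = (front - 1) & (elements.length - 1) ;
>>>>>>>>>>>>>>> >> >> >>>> >     elements[front] = e ;
>>>>>>>>>>>>>>> >> >> >>>> >     if (front == rear) doubleCapacity() ;
>>>>>>>>>>>>>>> >> >> >>>> > }
>>>>>>>>>>>>>>> >> >> >>>> >
>>>>>>>>>>>>>>> >> >> >>>> > void addLast(Object e) {
>>>>>>>>>>>>>>> >> >> >>>> >     elements[rear] = e ;
>>>>>>>>>>>>>>> >> >> >>>> >     rear = (rear + 1) & (elements.length - 1) ;
>>>>>>>>>>>>>>> >> >> >>>> >     if (front == rear) doubleCapacity() ;
>>>>>>>>>>>>>>> >> >> >>>> > }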
>>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>>> >> >> >>>> It was a half-baked idea - I don't know if there's
>>>>>>>>>>>>>>> >> >> >>>> a way to turn it into something useful. I was
>>>>>>>>>>>>>>> >> >> >>>> thinking of the ArrayDeque implementation, where
>>>>>>>>>>>>>>> >> >> >>>> all the elements live in a single array.
>>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>>> >> >> >>>> > It's not clear to me how that would help and/or
>>>>>>>>>>>>>>> >> >> >>>> > be a better approach than the cyclic list.
>>>>>>>>>>>>>>> >> >> >>>> > Anyhow, the paper that you reference, "Resizable
>>>>>>>>>>>>>>> >> >> >>>> > arrays in optimal time and space", gives a deque
>>>>>>>>>>>>>>> >> >> >>>> > so if we take that approach then the deque is
>>>>>>>>>>>>>>> >> >> >>>> > specified.
>>>>>>>>>>>>>>> >> >> >>>>
>>>>>>>>>>>>>>> >> >> >>>> Technically, ArrayList also supports the Deque
>>>>>>>>>>>>>>> >> >> >>>> operations - just not efficiently.
>>>>>>>>>>>>>>> >> >> >>>
>>>>>>>>>>>>>>> >> >> >>
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>