Spliterator implementations

Wed Jan 2 17:32:34 PST 2013

On 01/02/13 20:15, Brian Goetz wrote:
>> We'd need to document goodness (or good-enough-ness) in the
>> same way we document other algorithmic properties of JDK collections.
>> For anything hopelessly listy (almost all queues, LinkedList,
>> LinkedHashSet, etc) we'd want to clearly indicate O(n) splitting
>> that would be worthwhile only if each function application to each
>> element is very expensive. Not documenting these will surely lead
>> to justifiable bug reports.
>
> Right.  And similarly the defaults in Collection/List/etc have this property.
>
> Note that our default "split an iterator" algorithm is not quite as bad as the
> obvious O(n);

Yeah, but it is still O(n) :-), and you can't even do that unless
the spliterator/collection has a stable size/estimate. So nearly all
concurrent queues, for example ConcurrentLinkedQueue, are in
the really hopeless category...

>
>
> We can upgrade that to "definitely won't be."  For example, I have a hard time
> seeing people using ArrayBlockingQueue as a stream source with any frequency.
> So while there are plenty that could in principle be faster, engineering reality
> suggests many never will be -- and that's OK.
>

The messiest cases are NavigableMap.subMap views (all the
flavors -- ascend/descend, head/tail/range, key/value/entry).
In JDK classes (TreeMap and CSLM), they don't know their sizes
and unlike their top-levels, don't necessarily have regular shapes
to leverage for splitting. Without some added bookkeeping that would
slow down other existing usages, they might join the hopeless category.

-Doug