Spliterator implementations
Brian Goetz
brian.goetz at oracle.com
Wed Jan 2 17:15:42 PST 2013
> We'd need to document goodness (or good-enough-ness) in the
> same way we document other algorithmic properties of JDK collections.
> For anything hopelessly listy (almost all queues, LinkedList,
> LinkedHashSet, etc) we'd want to clearly indicate O(n) splitting
> that would be worthwhile only if each function application to each
> element is very expensive. Not documenting these will surely lead
> to justifiable bug reports.
Right. And similarly the defaults in Collection/List/etc have this
property.
Note that our default "split an iterator" algorithm is not quite as bad
as the obvious O(n); we serve up a sequence of increasing-sized splits,
so that we can immediately generate some work to fork while the pool is
warming up. So first split might be of size 1, second of size 2, etc,
up until we hit some threshold and flatten out.
> Conversely most array-based ones (including most hash tables)
> can document best-case O(n/p) performance (p is parallelism level)
> because of constant split time. Others somewhere in the middle.
> Of those others, there may be those that could in principle be
> faster but might not be for JDK8 ship?
We can upgrade that to "definitely won't be." For example, I have a
hard time seeing people using ArrayBlockingQueue as a stream source with
any frequency. So while there are plenty that could in principle be
faster, engineering reality suggests many never will be -- and that's OK.
More information about the lambda-libs-spec-experts
mailing list