Can non terminal operation mark stream as ordered without actually sorting it ?

Paul Sandoz paul.sandoz at oracle.com
Mon Apr 8 01:06:03 PDT 2013


Hi Boaz,

You are trying to impute an arbitrary encounter order in the map operation. If run in parallel the framework has no idea what those indexes will be. The map operation will encounter elements in a non-deterministic temporal order, which basically means the order in which threads concurrently report elements. In this case it does not matter here if the source has an encounter order or not. 

Given the source has no encounter order then any associative reduce-based operations may produce arbitrary results over multiple runs, regardless of whether you impute order or not.

Thus rather than doing a sort it would be just as effective to collect to a list, where the index of an element is its position in the list.

So, a general question i have is after you have arbitrarily associated an indexes with elements what do you intend to do with those indexes/elements?

--

Other solutions:

- use zip, to zip up the stream with an integer range.

- try some experiments by wrapping Spliterator e.g. one that takes a Spliterator<T> and wraps in a Spliterator<Pair<Integer, T>>.

Paul.

On Apr 7, 2013, at 5:55 PM, Boaz Nahum <boaznahum at gmail.com> wrote:

> Hi.
> 
> I have parallel unordered stream.
> 
> In some point I convert the stream to Pair<Integer, T> where *Integer  *is
> index of element in stream.
> It is quite clear that each element is assigned arbitrary index, but once
> assigned I want it to be ordered by this index.
> 
> Sure I can do:
> 
> si.sequential().map((t)->new Pair<>(index.incrementAndGet(), t));
> 
> But I already learnt that I cant count on 'sequential()' ...   :)
> 
> Or sort it:
> si.map((t)->new Pair<>(index.incrementAndGet(), t)).sorted((t1,
> t2)->t1.first.compareTo(t2.first));
> 
> But why pay the cost of sorting.
> 
> Or collect it to 'ordered' collection to protect 'sequential':
> si.sequential().map((t)->new Pair<>(index.incrementAndGet(),
> t)).collect(collector).stream();
> 
> here I pay the price of allocation Collection to hold all elements.
> 
> 
> What i really need is somethings like this:
> 
> si.*ordered*().map((t)->new Pair<>(index.incrementAndGet(), t));
> 
> But unlike 'sequential()/parallel()' It cant be global.
> 
> Does making stream 'ordered' as better/worse performance than converting it
> to stream->collection->stream ?
> 
> I know it is rare use, But still I'm wondering.
> 
> Thanks
> Boaz
> 



More information about the lambda-dev mailing list