Encounter order

Mon Oct 22 23:07:19 PDT 2012

Inline.

On Mon, Oct 22, 2012 at 10:24 PM, Brian Goetz <brian.goetz at oracle.com>wrote:

> I don't have any assumptions about ordering after .parallel().
>>
>> list.parallel().toArray() is analogous to a parallel-ized for-loop.
>>   (See Parallel.ForEach in C#.)
>>
>
> Certainly, list.parallel().forEach() is analogous to a parallelized
> for-loop.
>
>
>  If I wanted it to execute in order, I wouldn't parallel() it.
>>
>
> But I didn't anything about *executing* in order.  The question was, to
> what extent should order be preserved.
>
> For example, array.parallel().map().**toArray() parallelizes perfectly
> while preserving encounter order, though does not execute in order.

I didn't know that.

In PLINQ, for example, asParallel() is unordered, but asOrdered() can be
added to preserve the order.

http://msdn.microsoft.com/en-us/library/dd460677.aspx

(But I am not advocating for asOrdered.)

>  Or, if I wanted to "preserve" the order, I would zip-with-index before
>> parallel execution.
>>
>
> Does that mean you assume that reducing functions passed to reduce are
> necessarily commutative?
>
> What about
>
>   array.parallel().sorted().**toArray()
>
> ?  Would you expect the result to appear in the array in sorted order?
>

After .parallel(), I would make no assumptions about order, but after
.sorted(), I would expect the order to be sorted.

I'm curious to know what other people think.

>
>
>> --Joe
>>
>> On Mon, Oct 22, 2012 at 9:05 PM, Brian Goetz <brian.goetz at oracle.com
>> <mailto:brian.goetz at oracle.com**>> wrote:
>>
>>     For parallel calculations, there are several different things we
>>     might mean by order.  In the context of bulk operations, I'll (try
>>     to) use these terms:
>>
>>       - Encounter order.  A source is said to have an encounter order
>>     when the order in which elements are yielded by iterator / forEach
>>     is predictable and meaningful.  (This is a generalization of spatial
>>     order, since some sources may be purely generative rather than
>>     representing data structures.)
>>
>>     Sources such as arrays, lists, queues, sorted collections, and IO
>>     channels have an encounter order; sources like HashSets or the key
>>     set of a HashMap do not (though their implementation may have a
>>     predictable iteration order anyway.)
>>
>>       - Arrival order.  Arrival order (aka temporal order) is the time
>>     order in which elements may arrive at a stage of an operation
>>     pipeline.  In sequential calculations, the encounter order is
>>     generally the arrival order, but in parallel calculations may not
>>     be.  Sometimes this is good, sometimes not.
>>
>>     It is easy to track (via the stream flags mechanism) whether a
>>     stream source has (or more precisely, supports) an encounter order,
>>     and similarly easy to track whether an operation preserves that
>>     order.  We also don't start any processing until we've seen the
>>     whole pipeline.
>>
>>     Some operations have intrinsic semantic constraints that require us
>>     to produce a result that is consistent with processing the elements
>>     in encounter order.  For example, a fold/reduce operation across a
>>     list using an associative but not commutative operation.  (Though
>>     this can still be parallelized efficiently.)  Others should probably
>>     be constrained to preserve order just because this is consistent
>>     with user expectations (e.g., applying the map function x -> x*2 to
>>     [ 1, 2, 3 ] should probably yield [ 2, 4, 6 ], not [ 4, 6, 2 ].)
>>     Other operations, such as forEach, may have neither constraint.
>>
>>     Like any other constraint, requiring processing to be done
>>     consistently with encounter order has a cost.
>>
>>     Finally, some operations have targets (like into or toArray), which
>>     might themselves support or not support an encounter order.
>>
>>     What I am trying to accomplish here is to identify reasonable user
>>     expectations about order preservation for the (source, operation,
>>     target) combinations we have, and whether we want to / need to add
>>     additional operations to give users more control over ordering.
>>
>>     Here are some starter questions.  Let "list" contain the numbers
>>     1..10.  What should the following expressions compute?
>>
>>        list.parallel().toArray()
>>
>>        list.parallel().filter(e -> e > 5).toArray();
>>
>>        list.parallel().into(new ArrayList<>());
>>
>>        List target = new ArrayList();
>>        list.parallel().forEach(e -> target.add(e));
>>
>>     I think the first one "obviously" should result in an array [ 1..10
>>     ].  And the last should "obviously" not be constrained at all as to
>>     order, only that it print the numbers 1 through 10 in some order.
>>     The third one, though, isn't completely obvious what the right
>>     answer is.
>>
>>     What about this:
>>
>>        Map m = list.parallel().groupBy(e -> e % 2 == 0);
>>
>>     This partitions the list into two parts; should the elements in each
>>     part be in the encounter order from the original list?  (In other
>>     words, should this result in { 0 -> [ 2, 4, 6, 8, 10 ], 1 -> [ 1, 3,
>>     5, 7, 9 ] } ?)
>>
>>     The model I'm targeting is: IF the source has an encounter order AND
>>     the operation must preserve encounter order, then we need to be
>>     sensitive to encounter order, otherwise anything goes.  Is this the
>>     right model?  If so, what are the preservation settings for each
>>     operation?
>>
>>     (A secondary consideration is that there are often situations where
>>     the computation structurally suggests an encounter order, but the
>>     user doesn't actually care, and then is paying for ordering he
>>     doesn't want.  But there are fixes for this.)
>>
>>     Lots of more detailed questions but I'll let people respond to these.
>>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20121022/ba320850/attachment.html