Substream applied to unordered streams
Paul Sandoz
paul.sandoz at oracle.com
Tue Mar 19 03:47:50 PDT 2013
On Mar 19, 2013, at 10:54 AM, Andrew Haley <aph at redhat.com> wrote:
> On 03/16/2013 08:52 PM, Brian Goetz wrote:
>> I haven't yet determined whether we want to commit to stable sorting.
>> But I think we probably do not. Stability is a relatively cheap
>> property to preserve in a sequential sort (which is why Arrays.sort is
>> spec'ed as stable), but a relatively expensive property to preserve in
>> an parallel sort, and this moves the balance of whether stability is
>> worth the cost.
>
> Why? Merge sorting is usually stable, and parallel sorting is Xsort
> followed by a bunch of merges. What is it about parallel sorting that
> makes stability expensive?
>
> I guess there's something I must be missing. Stability is an
> extremely useful property.
>
Note that the stream implementation currently defers to Arrays.parallelSort(T[]) which states:
* <p>This sort is not guaranteed to be <i>stable</i>: equal elements
* may be reordered as a result of the sort.
The following might shed some light on the matter:
http://arxiv.org/pdf/1202.6575v3.pdf
"Binary search is used to divide the two input sequences into disjoint sequences that can be merged
pairwise independently. Binary searches are performed in parallel for a small selection of distinguished
elements from the two input sequences. A separate, parallel merge of distinguished and located elements
is needed to determine pairs of subsequences to merge. Often, such algorithms are not naturally stable,
and take extra space to be made stable."
I admit to only reading the abstract and introduction on the first page :-)
Hth,
Paul.
More information about the lambda-dev
mailing list