Performance model of sorted() ?

Brian Goetz brian.goetz at oracle.com
Fri Nov 1 11:18:19 PDT 2013


The "slicing" part is right, but the single-threaded assumption is wrong.

What happens is that the upstream stateless ops will be done in 
parallel, with the results being collected at the leaves of the 
computation tree, into a bunch of mini-buffers.  Now, each will be 
sorted (in parallel), and then we begin a parallel merge, collecting the 
results into an array.  Then the downstream stateless ops and the 
terminal op are run, in parallel.

So, no truly sequential steps imposed by the framework (though binary 
parallel merges are generally dominated by the last merge step.)

On 11/1/2013 2:01 PM, Millies, Sebastian wrote:
> Hello there,
>
> in a talk at JAX 2013 the processing model for sorting a stream with sorted() was described as follows.
> I’d like to know if that presentation is really correct:
>
> .parallelStream().statelessOps() // upstream slice
> .sorted() // sorted slice
> .statelessOps().terminal() // downstream slice
>
> • stream is sliced
> • resulting stream is buffered at the end of upstream slice (barrier)
> • downstream slice is started after the upstream slice is finished
> • sorted-slice is run by a single thread only
>
> I wonder particularly about the statement saying that the sorting is done by a single thread, after all
> there is parallel array sort etc. Can anyone clarify the performance model of sorted() ?
>
> -- Sebastian
>
>
> Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt, Germany – Registergericht/Commercial register: Darmstadt HRB 1562 - Vorstand/Management Board: Karl-Heinz Streibich (Vorsitzender/Chairman), Dr. Wolfram Jost, Arnd Zinnhardt; - Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Dr. Andreas Bereczky - http://www.softwareag.com
>
>


More information about the lambda-dev mailing list