stream() / parallel()
Brian Goetz
brian.goetz at oracle.com
Tue Nov 27 10:39:10 PST 2012
The "bun" operations of stream() and parallel() on Collections and
friends (which currently live in Streamable, but I am not sure that
Streamable carries its weight) work acceptably well for collections:
c.stream().ops(...)
and
c.parallel().ops(...)
The same thing works OK with the stream factories in Streams (which is
intended for use by libraries, not end-users):
Streams.stream(spliterator, flags)
Streams.parallel(spliterator, flags)
Arrays.stream(array)
Arrays.parallel(array)
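For readers on a released JDK, here is a rough sketch of the same idea using the java.util.stream API as it later shipped in Java 8. That API differs from the names above: the flags argument became a boolean on StreamSupport.stream, and there is no Arrays.parallel, so the array case is written as Arrays.stream(array).parallel(). The class and method names (FactorySketch, streamOf, sum) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class FactorySketch {
    // Library-style factory: wrap an Iterable's spliterator in a stream,
    // choosing sequential or parallel mode with a boolean.
    static <T> Stream<T> streamOf(Iterable<T> source, boolean parallel) {
        return StreamSupport.stream(source.spliterator(), parallel);
    }

    // Array case: the shipped API has Arrays.stream(array) only, so the
    // parallel variant is requested with parallel() on the result.
    static int sum(int[] values, boolean parallel) {
        IntStream s = Arrays.stream(values);
        return (parallel ? s.parallel() : s).sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c");
        System.out.println(streamOf(words, true).isParallel());
        System.out.println(sum(new int[] {1, 2, 3, 4}, true));
    }
}
```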
But we start to see friction in the other places where we surface streams:
IntStream String.chars()
IntStream intRange(int start, int limit)
Stream<String> BufferedReader.lines()
<T> Stream<T> Streams.iterate(T seed, UnaryOperator<T> op)
All of these are, currently, sequential streams. But it seems an
unfortunate limitation that they are forced to be sequential.
Duplicating these methods with parallel versions (e.g.,
String.charsAsParallelStream) is possible but seems kind of yucky.
Recent improvements in the Stream implementation (including the
strictures against "re-use") have created another possibility: an
explicit parallel() "op" on Streams.
If this appears at the head of the stream, the performance cost is
almost zero (a few extra garbage objects at setup time), and on a stream
that is already parallel, it would be a no-op ("if (isParallel) return
this").
At the use site, it would look like:
String.chars().parallel().forEach(...);
or
int[] mapped = Streams.intRange(0, 1000000).parallel().map(...).toArray();
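As a runnable sketch of these two use sites, the following uses the API that eventually shipped in Java 8, where Streams.intRange became IntStream.range and chars() is called on a String instance. The class name, method names, and the particular filter and map lambdas are placeholders chosen for illustration, not part of the proposal.

```java
import java.util.stream.IntStream;

class ParallelOpSketch {
    // chars() yields a sequential IntStream; the parallel() op at the head
    // of the pipeline switches it to parallel before any work happens.
    static long countVowels(String s) {
        return s.chars()
                .parallel()
                .filter(ch -> "aeiou".indexOf(ch) >= 0)
                .count();
    }

    // Same pattern on a range: parallel() directly after the source.
    static int[] squares(int limit) {
        return IntStream.range(0, limit)
                .parallel()
                .map(i -> i * i)
                .toArray();
    }

    // Calling parallel() on a stream that is already parallel leaves it parallel.
    static boolean stillParallel() {
        return IntStream.range(0, 10).parallel().parallel().isParallel();
    }

    public static void main(String[] args) {
        System.out.println(countVowels("stream and parallel"));
        System.out.println(squares(5).length);
        System.out.println(stillParallel());
    }
}
```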
Collections and other things that currently implement Streamable could
additionally offer a convenience method to fuse stream+parallel, which
is both slightly more compact and slightly more efficient:
c.stream()...
c.parallelStream()...
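A small sketch of the two spellings side by side, written against the Collection methods that shipped in Java 8 (stream() and parallelStream()). The helper name sumOfLengths and the sample data are hypothetical. Note that the shipped documentation describes parallelStream() as returning a "possibly parallel" stream, so the sketch only relies on both forms producing the same result.

```java
import java.util.Arrays;
import java.util.List;

class FusedParallelSketch {
    // Picks the fused convenience method or the plain sequential one.
    static int sumOfLengths(List<String> words, boolean parallel) {
        return (parallel ? words.parallelStream() : words.stream())
                .mapToInt(String::length)
                .sum();
    }

    // The unfused spelling: stream() followed by the parallel() op.
    static int sumOfLengthsUnfused(List<String> words) {
        return words.stream()
                .parallel()
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("bun", "stream", "parallel");
        System.out.println(sumOfLengths(words, false));
        System.out.println(sumOfLengths(words, true));
        System.out.println(sumOfLengthsUnfused(words));
    }
}
```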
(The goal here is not to encourage fine-grained "transitions" between
parallel and sequential; this is likely to be a performance loss.)
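For what it is worth, the API that later shipped in Java 8 does not support mid-pipeline transitions at all: the mode is a property of the whole pipeline, and the last parallel() or sequential() call before the terminal operation decides it. The sketch below illustrates that behavior of the shipped API, which postdates this message; the class and method names are invented for the example.

```java
import java.util.stream.IntStream;

class ModeSwitchSketch {
    // parallel() followed later by sequential(): the later call wins,
    // so the whole pipeline reports (and runs in) sequential mode.
    static boolean endsUpParallel() {
        return IntStream.range(0, 100)
                .parallel()
                .map(i -> i + 1)
                .sequential()
                .isParallel();
    }

    // The result is unaffected by the mode switches.
    static int total() {
        return IntStream.range(0, 100)
                .parallel()
                .map(i -> i + 1)
                .sequential()
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(endsUpParallel());
        System.out.println(total());
    }
}
```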