Forked streams
Richard Warburton
richard.warburton at gmail.com
Fri Nov 29 04:41:33 PST 2013
Hi,
Recently on this list Stuart mentioned Kirk's blog entry on lambdas, which
> has some discussion on forked streams:
>
> https://weblogs.java.net/blog/kcpeppe/archive/2013/11/10/fun-lambdas
>
> To get across the issues and have a little fun i wrote a simple stream
> forker.
>
> The basic approach is to create a recipe to fork a stream multiple times,
> apply that recipe to a stream, and collect the results. The results of each
> fork are encapsulated in instances of CompletableFuture. Each fork is
> executed asynchronously. The stream to fork is operated on using forEach
> with a consumer that pushes an element to each fork via a
> LinkedBlockingQueue. A forked stream's source is a special Spliterator
> covering the elements of the associated LinkedBlockingQueue.
>
> It's not very complicated, quite rough, and i am sure just one of many
> ways if doing this, but it does highlight the complexities/issues here.
>
> What does it mean for parallel streams? If a parallel forEach is performed
> on the stream to be forked that means elements will be pushed to the forked
> streams out of order. Resources for parallel execution of a stream will
> complete with concurrent execution of the forked streams, which may
> complication the performance characteristics.
>
> Each fork represents a separate computation, which may be formulated
> differently for parallel execution. The computation of toArray is different
> to that of findFirst.
>
> I can think of improvements to the current API that could enable better
> sequential forking, but for parallel forking it will be particularly tricky
> to fork the whole computation across thread boundaries and a questionable
> endeavour IMHO.
>
> So in general this can be complex, there are different models of execution
> competing for attention, and streams has currently opted for a simpler
> model that is easy to grok that with a bit of work can be utilized with
> forking if one is careful. There are probably ways we could improve that
> for any further streams work, perhaps with an SPI supporting user-defined
> operations where one could write a fork operation that behaves like a
> hybrid intermediate/terminal operation.
>
tl;dr whilst the core Java streams library may not be the right place to
support forkable streams, you can use RxJava to get what you want.
Another thing to mention on the front of forkable streams is the RxJava
project which brings reactive programming to the JVM. [0] They have a
concept called an Observable, that in many ways acts like stream. The
underlying threading model is quite different though, I won't go into
details but if you're interested in project lambda, then you should also
take a look at RxJava.
They have a way of connecting observables [1] which means elements in one
observable can be multicasted to other observables. The differences in the
underlying implementation between streams and observables mean that while
the idea of multicasting values from one stream into another brings a
series of complications that Paul has outlined, they don't really exist in
RxJava. Which is why I suggest that if you want that feature, maybe RxJava
is the library that you really should be using rather than Streams.
regards,
Richard Warburton
http://insightfullogic.com
@RichardWarburto <http://twitter.com/richardwarburto>
[0] https://github.com/Netflix/RxJava
[1] https://github.com/Netflix/RxJava/wiki/Connectable-Observable-Operators
More information about the lambda-dev
mailing list