Combining streams

Paul Sandoz paul.sandoz at oracle.com
Tue Jul 30 00:33:44 PDT 2013


On Jul 29, 2013, at 9:48 PM, Michael Hixson <michael.hixson at gmail.com> wrote:
> Hi Paul,
> 
> Thanks, that all makes sense.  Incidentally, after staring at the
> Stream.concat implementation for a while, I believe I've spotted a
> minor bug.
> 
> http://hg.openjdk.java.net/lambda/lambda/jdk/file/943bf0f62463/src/share/classes/java/util/stream/Streams.java
> 
>  687  unsized = aSpliterator.hasCharacteristics(SIZED)
>  688            && aSpliterator.hasCharacteristics(SIZED)
>  689            && aSpliterator.estimateSize() + bSpliterator.estimateSize() < 0;
> 
> Should line 688 have "bSpliterator" instead of "aSpliterator"?
> 

Well spotted!

I have logged a bug and will fix it, adding some more tests to verify the unsized case.
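
For anyone following along, here is a minimal sketch of the condition with the suggested correction applied, written as a standalone helper rather than as the actual Streams.java code. The class and method names (UnsizedCheck, sizedSumOverflows) are made up for illustration. The check is true only when both spliterators report SIZED and the sum of their sizes overflows a long (wraps to a negative value).

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.stream.LongStream;
import java.util.stream.Stream;

// Hypothetical helper, not the JDK source: mirrors the quoted condition
// with the second test applied to b instead of a.
class UnsizedCheck {

    // True when both inputs are SIZED and their combined size overflows long.
    public static boolean sizedSumOverflows(Spliterator<?> a, Spliterator<?> b) {
        return a.hasCharacteristics(Spliterator.SIZED)
                && b.hasCharacteristics(Spliterator.SIZED)
                && a.estimateSize() + b.estimateSize() < 0;
    }

    public static void main(String[] args) {
        // Two small sized sources: the sum fits in a long.
        System.out.println(sizedSumOverflows(
                Arrays.asList(1, 2, 3).spliterator(),
                Arrays.asList(4, 5).spliterator()));

        // Two sized sources of Long.MAX_VALUE elements each: the sum wraps negative.
        System.out.println(sizedSumOverflows(
                LongStream.range(0, Long.MAX_VALUE).spliterator(),
                LongStream.range(0, Long.MAX_VALUE).spliterator()));

        // Sized a, non-SIZED b (an iterate-based stream): testing a twice
        // would wrongly accept this pair; testing b rejects it.
        System.out.println(sizedSumOverflows(
                LongStream.range(0, Long.MAX_VALUE).spliterator(),
                Stream.iterate(0, i -> i + 1).spliterator()));
    }
}
```

The third call is the case the typo affects: with aSpliterator tested twice, a non-SIZED b would never be examined.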


> Feel free to ignore the rest of this email.  I don't ask any more
> questions, I just explain where I ended up in case anyone is curious.
> 
> I tried out a parallel reduce+concat and it threw an exception.  I
> guess I was misusing the identity parameter of reduce.  


Ah, the problem is that the stream returned by Stream.empty() is not immutable (or, in effect, constant for the duration of the reduction). Kicking myself for not recognizing that earlier on. Obtaining the spliterator from the empty stream causes that stream to be consumed, so it cannot be reused as the identity.
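
A small sketch of that effect, assuming the behaviour of the released java.util.stream API, where a stream may be operated on only once and Stream.concat obtains a spliterator from each argument when it is called. The class and method names (EmptyIdentityDemo, secondSpliteratorFails, secondConcatFails) are made up for illustration.

```java
import java.util.stream.Stream;

// Illustrative only: shows why a single Stream.empty() instance cannot
// serve as a reusable identity value.
class EmptyIdentityDemo {

    // Asking the same stream for its spliterator twice fails the second time.
    public static boolean secondSpliteratorFails() {
        Stream<String> identity = Stream.empty();
        identity.spliterator();          // first use consumes the stream
        try {
            identity.spliterator();      // second use of the same instance
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // The same thing happens when the instance is passed to concat twice,
    // which is what a reduction with a shared identity can end up doing.
    public static boolean secondConcatFails() {
        Stream<String> identity = Stream.empty();
        Stream.concat(identity, Stream.of("a"));
        try {
            Stream.concat(identity, Stream.of("b"));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondSpliteratorFails());
        System.out.println(secondConcatFails());
    }
}
```

In a parallel reduce the identity may be combined more than once, which would explain the exception you saw.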


> Instead of
> this:
> 
>  Stream.of(s1, s2, s3, ...)
>      .parallel()
>      .reduce(Stream.empty(), Stream::concat)
> 
> I need to do this:
> 
>  Stream.of(s1, s2, s3, ...)
>      .parallel()
>      .reduce(Stream::concat)
>      .orElseGet(Stream::empty)
> 
> I tried writing my own spliterator for concat and it only did slightly
> better than the others.  That is,
> 
>  A. custom spliterator-based varargs concat
>  B. balanced tree varargs concat (from my previous email)
>  C. parallel reduce + concat
> 
> A > B > C, by just a little bit, for every variation of top-level
> stream size, substream size, and parallel-or-sequential that I tried.
> In terms of lines of additional library code (minus documentation),
> 
>  A. ~300 lines
>  B. ~80 lines
>  C. 0 lines
> 
> So I think I will stick with C. :)
> 

:-) thanks very much for your investigations.
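
For the archive, option C from the quoted message can be written out as a self-contained sketch. The wrapper class, the method name concatAll, and the use of lists as stream sources are assumptions made so the example runs on its own; the reduce(Stream::concat).orElseGet(Stream::empty) chain is the one quoted above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative wrapper around the "parallel reduce + concat" approach.
class ReduceConcatDemo {

    // Concatenates one stream per input list, in encounter order.
    public static List<Integer> concatAll(List<List<Integer>> parts) {
        return parts.stream()
                .map(List::stream)            // a fresh stream per part, each used once
                .parallel()
                .reduce(Stream::concat)       // no identity value to be reused
                .orElseGet(Stream::empty)     // empty result only when there are no parts
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3),
                Collections.<Integer>emptyList(),
                Arrays.asList(4, 5));
        System.out.println(concatAll(parts));
    }
}
```

Because there is no identity argument, every stream handed to Stream.concat is used exactly once, and Stream.empty() is created only if the outer stream has no elements.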

For any var-arg concat in the JDK we would probably have to go with an A-like solution. Thankfully, if we do that, most of the logic is independent of the shape of the stream (object or primitive).

Paul.


More information about the lambda-dev mailing list