Streams.generate: infinite or finite?

Jim Mayer jim at pentastich.org
Fri Apr 12 14:18:00 PDT 2013


Perhaps I'm missing something, but what makes it hard to do a
high performance parallel split of an infinite stream?  Sure, you can't
break it up into even chunks, but is that really all that different than
breaking up Long.MAX_VALUE values?  On an 8 core machine you're not going
to break them up into 2^60 element chunks either, right?

Yours confusedly,

Jim Mayer
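
A minimal sketch of what one split looks like in each case may help frame the question. It assumes the names of the released Java 8 API (LongStream.range standing in for the Streams.longRange discussed in this thread); the class and helper names are hypothetical. An iterator-backed spliterator of unknown size can only copy a batch of elements into a prefix and hand that off, so the remainder still reports an unknown size. A range knows its size and can cut itself into two sized pieces.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.LongStream;

class SplitShapes {

    /** Returns {prefix size, remainder size} after one trySplit() on an
     *  iterator-backed spliterator of unknown size. */
    static long[] iteratorSplit() {
        // An endless iterator, standing in for the supplier-backed one.
        Iterator<Integer> endless = new Iterator<Integer>() {
            @Override public boolean hasNext() { return true; }
            @Override public Integer next() { return 42; }
        };
        Spliterator<Integer> rest = Spliterators.spliteratorUnknownSize(
                endless, Spliterator.ORDERED | Spliterator.IMMUTABLE);
        // The split-off part is a batch copied out of the iterator.
        Spliterator<Integer> prefix = rest.trySplit();
        return new long[] { prefix.estimateSize(), rest.estimateSize() };
    }

    /** Returns {left size, right size} after one trySplit() on a sized range. */
    static long[] rangeSplit(long n) {
        Spliterator.OfLong right = LongStream.range(0, n).spliterator();
        Spliterator.OfLong left = right.trySplit();
        return new long[] { left.estimateSize(), right.estimateSize() };
    }

    public static void main(String[] args) {
        long[] it = iteratorSplit();
        System.out.println("iterator: prefix=" + it[0] + " rest=" + it[1]);
        long[] rg = rangeSplit(1L << 20);
        System.out.println("range:    left=" + rg[0] + " right=" + rg[1]);
    }
}
```

Repeated splitting of the iterator-backed spliterator keeps peeling prefixes off the left while the unsized remainder stays on the right, which gives the right-heavy tree described in the quoted message. The exact batch size is an implementation detail, so the sketch prints it rather than assuming a value.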


On Fri, Apr 12, 2013 at 10:50 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:

> Hi,
>
> Currently Streams.generate produces an infinite stream. This is
> theoretically nice but splits poorly (right-balanced trees).
>
> Implementation-wise Streams.generate creates a spliterator from an
> iterator:
>
>     public static<T> Stream<T> generate(Supplier<T> s) {
>         Objects.requireNonNull(s);
>         InfiniteIterator<T> iterator = s::get;
>         return StreamSupport.stream(Spliterators.spliteratorUnknownSize(
>                 iterator,
>                 Spliterator.ORDERED | Spliterator.IMMUTABLE));
>     }
>
> The method is used in java.util.Random:
>
>     public IntStream ints() {
>         return Streams.generateInt(this::nextInt);
>     }
>
> There might be a nasty surprise in store for developers that expect the
> randomly generated stream of int values to have the best parallel
> performance.
>
>
> We can change Streams.generate to be finite (or not known to be finite in
> the time allotted to do some computation) by implementing it as follows:
>
>     public static<T> Stream<T> generate(Supplier<T> s) {
>       return Streams.longRange(0, Long.MAX_VALUE).mapToObj(i -> s.get());
>     }
>
> This will yield better parallel performance because the splits are
> balanced.
>
> We can further change to:
>
>     public static<T> Stream<T> generate(Supplier<T> s) {
>       return Streams.longs().mapToObj(i -> s.get());
>     }
>
> if we introduce the longs() idiom.
>
>
> I think we should go finite and add Streams.longs().  Agree or disagree?
>
> Then it is actually questionable if Streams.generate should exist at all.
> It does have some pedagogic value since the idiom Streams.longs().map() may
> not be obvious. So I would be mostly inclined to keep it for that reason.
>
> Paul.
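
For readers trying the range-backed proposal on a released JDK, here is a hedged sketch under a few assumptions: LongStream.range is used in place of Streams.longRange, the class name FiniteGenerate and the helper firstN are hypothetical, and this is not how the JDK's own Stream.generate is implemented. It only shows that a finite but very long range mapped through a supplier behaves like a generator when combined with limit().

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.LongStream;
import java.util.stream.Stream;

class FiniteGenerate {

    /** Hypothetical range-backed generate: finite, but with
     *  Long.MAX_VALUE elements; the index i is ignored. */
    static <T> Stream<T> generate(Supplier<T> s) {
        Objects.requireNonNull(s);
        return LongStream.range(0, Long.MAX_VALUE).mapToObj(i -> s.get());
    }

    /** Collects the first n values of a counting supplier (sequential). */
    static List<Integer> firstN(int n) {
        AtomicInteger counter = new AtomicInteger();
        return FiniteGenerate.<Integer>generate(counter::incrementAndGet)
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstN(5)); // prints [1, 2, 3, 4, 5]
    }
}
```

The example is sequential on purpose: with a stateful supplier such as the counter above, a parallel run would not assign values to positions in any predictable order.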

