Streams.generate: infinite or finite?
Jim Mayer
jim at pentastich.org
Fri Apr 12 14:18:00 PDT 2013
Perhaps I'm missing something, but what makes it hard to do a
high performance parallel split of an infinite stream? Sure, you can't
break it up into even chunks, but is that really all that different than
breaking up Long.MAX_VALUE values? On an 8 core machine you're not going
to break them up into 2^60 element chunks either, right?
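
To make the question concrete, here is a minimal sketch of a supplier-backed source that never runs out of elements yet still splits in halves, by halving a size estimate that starts at Long.MAX_VALUE. The class name HalvingSupplierSpliterator is hypothetical. The Spliterator and StreamSupport names are those of the released Java 8 API rather than the draft under discussion. The sketch assumes that encounter order does not matter (it does not report ORDERED) and that the supplier is safe to call from several threads.

```java
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch: an infinite, unordered source that splits by halving
// its size estimate instead of copying elements out of an iterator.
public class HalvingSupplierSpliterator<T> implements Spliterator<T> {
    private final Supplier<? extends T> supplier;
    private long estimate;

    public HalvingSupplierSpliterator(Supplier<? extends T> supplier, long estimate) {
        this.supplier = supplier;
        this.estimate = estimate;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        action.accept(supplier.get());
        return true; // the source is never exhausted
    }

    @Override
    public Spliterator<T> trySplit() {
        if (estimate <= 1) {
            return null; // refuse to split any further
        }
        estimate >>>= 1; // keep half of the estimate for this spliterator
        // Hand the other half to a new spliterator over the same supplier.
        return new HalvingSupplierSpliterator<>(supplier, estimate);
    }

    @Override
    public long estimateSize() {
        return estimate;
    }

    @Override
    public int characteristics() {
        return IMMUTABLE; // deliberately neither ORDERED nor SIZED
    }

    // Convenience factory: wraps the spliterator in a stream.
    public static <T> Stream<T> stream(Supplier<? extends T> s, boolean parallel) {
        return StreamSupport.stream(
                new HalvingSupplierSpliterator<T>(s, Long.MAX_VALUE), parallel);
    }
}
```

Each split leaves both sides with the same estimate, so the resulting tree is balanced even though no side ever finishes on its own. A short-circuiting operation such as limit() is still needed to terminate.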
Yours confusedly,
Jim Mayer
On Fri, Apr 12, 2013 at 10:50 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> Hi,
>
> Currently Streams.generate produces an infinite stream. This is
> theoretically nice but splits poorly (right-balanced trees).
>
> Implementation-wise Streams.generate creates a spliterator from an
> iterator:
>
> public static <T> Stream<T> generate(Supplier<T> s) {
>     Objects.requireNonNull(s);
>     InfiniteIterator<T> iterator = s::get;
>     return StreamSupport.stream(Spliterators.spliteratorUnknownSize(
>             iterator,
>             Spliterator.ORDERED | Spliterator.IMMUTABLE));
> }
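
The right-balanced shape can be observed with a small experiment. This is only a sketch against the released Java 8 API, which is an assumption since the draft names quoted above differ, and IteratorSplitDemo is a made-up class name. A spliterator built from an iterator of unknown size can only split by copying a batch of elements off the front, so the prefix is a small, finite chunk while the remainder still stands for "everything else". The exact batch size is a detail of the JDK in use, not something the sketch relies on.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;

public class IteratorSplitDemo {
    // An iterator that never ends, standing in for the InfiniteIterator above.
    static Iterator<Integer> ones() {
        return new Iterator<Integer>() {
            @Override public boolean hasNext() { return true; }
            @Override public Integer next() { return 1; }
        };
    }

    // Splits once and returns { prefix estimate, remainder estimate }.
    static long[] splitOnce() {
        Spliterator<Integer> rest = Spliterators.spliteratorUnknownSize(
                ones(), Spliterator.ORDERED | Spliterator.IMMUTABLE);
        Spliterator<Integer> prefix = rest.trySplit(); // copies a batch off the front
        return new long[] { prefix.estimateSize(), rest.estimateSize() };
    }

    public static void main(String[] args) {
        long[] sizes = splitOnce();
        System.out.println("prefix=" + sizes[0] + " remainder=" + sizes[1]);
    }
}
```

Repeating trySplit() on the remainder keeps peeling batches off the left, which is what produces a tree that leans to the right.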
>
> The method is used in java.util.Random:
>
> public IntStream ints() {
>     return Streams.generateInt(this::nextInt);
> }
>
> There might be a nasty surprise in store for developers that expect the
> randomly generated stream of int values to have the best parallel
> performance.
>
>
> We can change Streams.generate to be finite (or not known to be finite in
> the time allotted to do some computation) by implementing as follows:
>
> public static <T> Stream<T> generate(Supplier<T> s) {
>     return Streams.longRange(0, Long.MAX_VALUE).mapToObj(i -> s.get());
> }
>
> This will yield better parallel performance because the splits are
> balanced.
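
For comparison, a range-backed source knows its size and can split by index arithmetic alone. The sketch below uses LongStream.range from the released Java 8 API in place of the draft Streams.longRange, which is an assumption, and RangeBackedGenerate is a hypothetical class name. It splits a small range once to show the two halves, and it restates the proposed generate() on top of a range.

```java
import java.util.Spliterator;
import java.util.function.Supplier;
import java.util.stream.LongStream;
import java.util.stream.Stream;

public class RangeBackedGenerate {
    // The proposed finite generate(), restated with LongStream.range.
    static <T> Stream<T> generate(Supplier<T> s) {
        return LongStream.range(0, Long.MAX_VALUE).mapToObj(i -> s.get());
    }

    // Splits the range [0, n) once and returns { left estimate, right estimate }.
    static long[] splitRange(long n) {
        Spliterator.OfLong right = LongStream.range(0, n).spliterator();
        Spliterator.OfLong left = right.trySplit(); // no elements are copied
        return new long[] { left.estimateSize(), right.estimateSize() };
    }

    public static void main(String[] args) {
        long[] sizes = splitRange(1024);
        System.out.println("left=" + sizes[0] + " right=" + sizes[1]);
        generate(() -> "x").limit(3).forEach(System.out::println);
    }
}
```

Because the source is sized, the two estimates always add up to the original length. How a very large range such as [0, Long.MAX_VALUE) is divided is left to the implementation.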
>
> We can further change to:
>
> public static <T> Stream<T> generate(Supplier<T> s) {
>     return Streams.longs().mapToObj(i -> s.get());
> }
>
> if we introduce the longs() idiom.
>
>
> I think we should go finite! and add Streams.longs(). Agree? or disagree?
>
> Then it is actually questionable if Streams.generate should exist at all.
> It does have some pedagogic value since the idiom Streams.longs().map() may
> not be obvious. So I would be mostly inclined to keep it for that reason.
>
> Paul.