RFR: 8072727 - add variation of Stream.iterate() that's finite
Brian Goetz
brian.goetz at oracle.com
Mon Feb 15 16:57:52 UTC 2016
The model that Tagir implemented has a natural analogue: the for loop.
for (T v = seed; predicate.test(v); v = op.apply(v)) { … }
Like a for-loop, this proposed version of iterate() can generate an empty sequence; if the predicate applied to the seed returns false, you’ll get an empty stream.
There are lots of for-loops that might want to be converted to streams, but don’t fit into the existing factories:
for (T p = root; p != null; p = p.next) { … }
converts naturally to
Stream.iterate(root, p -> p != null; p -> p.next)…
Providing an initial value is not at odds with an empty sequence; it still gets tested by the predicate.
On Feb 14, 2016, at 8:03 PM, Stuart Marks <stuart.marks at oracle.com> wrote:
> Hi Tagir,
>
> Thanks for picking this up.
>
> I'll be at a conference this week and I won't have much time to discuss this in detail until afterward. But here are some quick thoughts about the proposal.
>
> I'd suggest focusing on the API first before worrying about how to track the stream state with booleans, etc. Is the API convenient to use, and how well does it support the use cases we envision for it?
>
> In particular, I can imagine a number of cases where it would be very helpful to be able to support an empty stream, or where the computation to produce the first element is the same as the computation to produce subsequent elements. Requiring a value for the first stream element is at odds with that.
>
> Here are some ideas for use cases to try out:
>
> - a series of dice rolls representing a round of craps [1]
> - elements drawn from a queue until the queue is empty or until
> a sentinel is reached
> - a sequence of numbers that (probably) terminates but whose length
> isn't necessarily known in advance (e.g. Collatz sequence [2])
>
> [1] https://en.wikipedia.org/wiki/Craps
>
> [2] https://en.wikipedia.org/wiki/Collatz_conjecture
>
> Note that in some cases the sentinel value that terminates the stream should be part of the stream, and in other cases it's not.
>
> I'm sure you can find more uses cases by perusing Stack Overflow. :-)
>
> I'm a bit skeptical of the use of "iterate" for producing a finite stream. There are the usual issues with overloading, but there's also potential confusion as some forms of iterate() are infinite and others finite. I'll suggest the name "produce" instead, but there are surely better terms.
>
> One thing to think about is where the state of the producer is stored. Is it expected to be in an argument that's passed to each invocation of the functional argument, or is it expected to be captured? I don't think there's an answer in isolation; examining use cases would probably shed some light here.
>
> Here are a few API ideas (wildcards elided):
>
> --
>
> <T> Stream<T> iterate(T seed, Predicate<T> predicate, UnaryOperator<T> f)
>
> The API from your proposal, for comparison purposes.
>
> --
>
> <T> Stream<T> produce(Supplier<Optional<T>>)
>
> Produces elements until empty Optional is returned. This box/unboxes every element, maybe(?) alleviated by Valhalla.
>
> --
>
> <T> Stream<T> produce(BooleanSupplier, Supplier<T>)
>
> Calls the BooleanSupplier; if true the next stream element is what's returned by calling the Supplier. If BooleanSupplier returns false, end of stream. If you have an iterator already, this enables
>
> produce(iterator::hasNext, iterator::next)
>
> But if you don't have an iterator already, coming up with the functions to satisfy the iterator-style protocol is sometimes painful.
>
> --
>
> <T> Stream<T> produce(Predicate<Consumer<T>> advancer)
>
> This has an odd signature, but the function is like Spliterator.tryAdvance(). It must either call the consumer once and return true, or return false without calling the consumer.
>
> --
>
> <T> Stream<T> produce(Consumer<Consumer<T>> advancer)
>
> A variation of the above, without a boolean return. The advancer calls the consumer one or more times to add elements to the stream. End of stream occurs when the advancer doesn't call the consumer.
>
> --
>
> <T> Stream<T> produce(Supplier<Stream<T>>)
>
> A variation of Supplier<Optional<T>> where the supplier returns a stream containing zero or more elements. The stream terminates if the supplier returns an empty stream. There "boxing" overhead here, but we don't seem to be bothered by this with flatMap().
>
> --
>
> s'marks
>
>
> On 2/14/16 6:53 AM, Tagir F. Valeev wrote:
>> Hello!
>>
>> I wanted to work on foldLeft, but Brian asked me to take this issue
>> instead. So here's webrev:
>> http://cr.openjdk.java.net/~tvaleev/webrev/8072727/r1/
>>
>> I don't like iterator-based Stream source implementations, so I made
>> them AbstractSpliterator-based. I also implemented manually
>> forEachRemaining as, I believe, this improves the performance in
>> non-short-circuiting cases.
>>
>> I also decided to keep two flags (started and finished) to track the
>> state. Currently existing implementation of infinite iterate() does
>> not use started flag, but instead reads one element ahead for
>> primitive streams. This seems wrong to me and may even lead to
>> unexpected exceptions (*). I could get rid of "started" flag for
>> Stream.iterate() using Streams.NONE, but this would make object
>> implementation different from primitive implementations. It would also
>> be possible to keep single three-state variable (byte or int,
>> NOT_STARTED, STARTED, FINISHED), but I doubt that this would improve
>> the performance or footprint. Having two flags looks more readable to
>> me.
>>
>> Currently existing two-arg iterate methods can now be expressed as a
>> partial case of the new method:
>>
>> public static<T> Stream<T> iterate(final T seed, final UnaryOperator<T> f) {
>> return iterate(seed, x -> true, f);
>> }
>> (same for primitive streams). I may do this if you think it's
>> reasonable.
>>
>> I created new test class and added new iterate sources to existing
>> data providers.
>>
>> Please review and sponsor!
>>
>> With best regards,
>> Tagir Valeev.
>>
>> (*) Consider the following code:
>>
>> int[] data = {1,2,3,4,-1};
>> IntStream.iterate(0, x -> data[x])
>> .takeWhile(x -> x >= 0)
>> .forEach(System.out::println);
>>
>> Currently this unexpectedly throws an AIOOBE, because
>> IntStream.iterate unnecessarily tries to read one element ahead.
>>
More information about the core-libs-dev
mailing list