Fwd: Re: Stream.limit() - puzzler/bug/feature

Brian Goetz brian.goetz at oracle.com
Thu Nov 15 11:45:14 PST 2012


So, there are a couple of questions here.

  - Linear stream chain, but multiple use.  For example:

    Stream<T> s = ...
    T first = s.findFirst();
    T second = s.findFirst();

Should this be legal?  It seems harmless in this example, and the serial 
implementation is trivial, but making guarantees about what state a 
stream is left in after a terminal operation is a rat's nest for very 
little value.  (If you want to get at the elements in a way that the 
built-in ops can't support, the escape hatch is "iterator" or 
"spliterator".)

Preference: calling a terminal op on a stream, or asking for an 
iterator/spliterator, "seals" the stream and later use of that stream is 
an error.  Not clear we can enforce this at acceptable cost, so the 
fallback is "results are undefined and implementations are encouraged to 
free any resources held."
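
For reference, a minimal sketch of what "sealing" looks like to a caller.
It assumes the java.util.stream API as it later shipped in Java 8, where
findFirst() returns Optional<T> rather than T and reuse of a consumed
stream is reported with IllegalStateException.  The class and method
names are illustrative only.

```java
import java.util.Optional;
import java.util.stream.Stream;

class SealedStreamDemo {
    // Returns true if a second terminal op on the same stream is rejected.
    static boolean secondTerminalOpFails() {
        Stream<String> s = Stream.of("a", "b", "c");
        Optional<String> first = s.findFirst();  // terminal op: consumes s
        try {
            s.findFirst();                       // second use of the same stream
            return false;                        // reuse was silently accepted
        } catch (IllegalStateException e) {
            return true;                         // reuse was reported as an error
        }
    }

    public static void main(String[] args) {
        System.out.println("second use rejected: " + secondTerminalOpFails());
    }
}
```

Callers who really need the element twice can hold on to the result of
the first call, or build a fresh stream from the source.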

  - Nonlinear stream graph.  For example:

    Stream<T> s = ...
    Stream<T> a = s.filter(...);
    Stream<T> b = s.map(...);
    // use a and b

We don't want to go out of our way to support this as it constrains our 
ability to optimize using lookahead and such in the common case.  Again, 
ideally the above should probably be an error, but not clear if we can 
enforce that perfectly/economically.
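
A similar sketch for the forked case, again assuming the java.util.stream
API as later released in Java 8 (names are illustrative).  There, the
second intermediate operation hung off the same stream is rejected
when it is linked, before any element is processed.

```java
import java.util.stream.Stream;

class ForkedStreamDemo {
    // Returns true if attaching a second branch to the same stream is rejected.
    static boolean secondBranchFails() {
        Stream<Integer> s = Stream.of(1, 2, 3);
        Stream<Integer> a = s.filter(x -> x > 1);   // first branch links to s
        try {
            Stream<Integer> b = s.map(x -> x * 2);  // second branch off the same s
            return false;                           // fork was accepted
        } catch (IllegalStateException e) {
            return true;                            // fork was reported as an error
        }
    }

    public static void main(String[] args) {
        System.out.println("fork rejected: " + secondBranchFails());
    }
}
```

To feed two pipelines from one source, ask the source for two streams
instead of sharing one.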

On 11/15/2012 2:16 PM, Remi Forax wrote:
> This sprang up on the lambda-dev list and Brian asked me to transfer it to
> the EG list,
> given that the last messages only involved people from this EG, I have
> copy/pasted these messages.
>
> The problem is how to deal with streams that are created from IO objects.
> Should the implementation throw a runtime exception if such a stream
> is iterated twice,
> by a forEach for example?
>
> Rémi
>
>
> Brian wrote:
>  > A related question is what should happen in this case:
>  >
>  >  T first = stream.findFirst();
>  >  T second = stream.findFirst();
>  >
>  > This "accidentally" works in the current serial impl, but is in
> general a nightmare.  Terminal ops should probably "close" the stream.
>
>
> Sam wrote:
>
>> This was my thinking when I read the example. Not sure if that is
>> practical but it might reduce errors such as the one described.
>>
>> Sam
>
> On Nov 15, 2012, at 9:36 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>
>> On 11/15/2012 06:22 PM, Brian Goetz wrote:
>>> The best way to think about it is that a Stream is more like an Iterator
>>> than a data structure.  There is some abstract source of data somewhere
>>> (it might be in a data structure, or might be generated from a function,
>>> or read from a network), and a series of transformations applied to the
>>> data between the source and the consumer.  Streams can additionally
>>> execute using parallelism, if requested.
>>>
>>> Stream constructs like:
>>>
>>> Stream<Person> s = people.stream()
>>>                           .filter(p -> p.getLastName().equals("Smith"));
>>>
>>> do not do any filtering on construction.  They simply say "there's a
>>> stream source, the collection 'people', and when you consume from the
>>> stream s, you'll get the results of filtering the source values."
>>>
>>> The confusion in Dmitry's example is akin to multiple activities reading
>>> from the same IO channel -- they might interfere with each other over
>>> who gets the next value, and any buffering that any consumer does may
>>> confuse other consumers.
>>
>> Maybe the implementation should protect users from using two aliases of a
>> non-replayable stream.
>> Using the example of Dmitry, if the stream is an IO channel, the second
>> call to limit() or to any method of 's' should throw an
>> IllegalStateException.
>>
>> Rémi
>>
>

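The point quoted above, that filter() does no work when the stream is
constructed, can be checked with a small sketch.  It assumes the
java.util.stream API as later released in Java 8; the list of last names
and the call counter are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyFilterDemo {
    // Returns {predicate calls before consuming, calls after, matches found}.
    static int[] predicateCalls() {
        List<String> lastNames = Arrays.asList("Smith", "Jones", "Smith");
        AtomicInteger calls = new AtomicInteger();

        // Building the pipeline only records the filter; nothing runs yet.
        Stream<String> s = lastNames.stream().filter(n -> {
            calls.incrementAndGet();
            return n.equals("Smith");
        });
        int before = calls.get();

        // The terminal operation pulls elements through the filter.
        List<String> smiths = s.collect(Collectors.toList());
        int after = calls.get();

        return new int[] { before, after, smiths.size() };
    }

    public static void main(String[] args) {
        int[] r = predicateCalls();
        System.out.println("before=" + r[0] + " after=" + r[1] + " matches=" + r[2]);
    }
}
```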

More information about the lambda-libs-spec-observers mailing list