Fwd: Re: Stream.limit() - puzzler/bug/feature
Brian Goetz
brian.goetz at oracle.com
Thu Nov 15 11:45:14 PST 2012
So, there are a couple of questions here.
- Linear stream chain, but multiple use. For example:
Stream<T> s = ...
T first = s.findFirst();
T second = s.findFirst();
Should this be legal? It seems harmless in this example, and the serial
implementation is trivial, but making guarantees about what state a
stream is left in after a terminal operation is a rat's nest for very
little value. (If you want to get at the elements in a way that the
built-in ops can't support, the escape hatch is "iterator" or
"spliterator".)
Preference: calling a terminal op on a stream, or asking for an
iterator/spliterator, "seals" the stream and later use of that stream is
an error. Not clear we can enforce this at acceptable cost, so the
fallback is "results are undefined and implementations are encouraged to
free any resources held."
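
As an illustration of the "seal" preference, here is a minimal sketch. It assumes the java.util.stream API as it later shipped in Java 8 (where findFirst() returns an Optional), which is not the API under discussion in this thread; the class and method names are hypothetical. It shows a second terminal op being rejected and the iterator escape hatch used to pull the first two elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical demo class; assumes the Java 8 java.util.stream API.
public class SealedStreamSketch {

    // Returns true if a second terminal op on the same stream is rejected.
    static boolean secondTerminalOpFails() {
        Stream<String> s = Arrays.asList("a", "b", "c").stream();
        Optional<String> first = s.findFirst(); // terminal op: seals s
        try {
            s.findFirst();                      // reuse of a sealed stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Escape hatch: ask for an iterator once and pull elements manually.
    static List<String> firstTwoViaIterator(List<String> source) {
        Iterator<String> it = source.stream().iterator();
        List<String> out = new ArrayList<>();
        while (it.hasNext() && out.size() < 2) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOpFails());
        System.out.println(firstTwoViaIterator(Arrays.asList("a", "b", "c")));
    }
}
```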
- Nonlinear stream graph. For example:
Stream<T> s = ...
Stream<T> a = s.filter(...);
Stream<T> b = s.map(...);
// use a and b
We don't want to go out of our way to support this as it constrains our
ability to optimize using lookahead and such in the common case. Again,
ideally the above should probably be an error, but not clear if we can
enforce that perfectly/economically.
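
A sketch of the nonlinear case, again assuming the Java 8 java.util.stream API as it later shipped rather than the API under discussion here; the class and method names are hypothetical. In that API, linking a second operation onto an already-linked stream fails, so each branch has to start from a fresh stream over the source.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; assumes the Java 8 java.util.stream API.
public class ForkedStreamSketch {

    // Returns true if forking one stream into two pipelines is rejected.
    static boolean forkingFails() {
        Stream<Integer> s = Arrays.asList(1, 2, 3, 4).stream();
        Stream<Integer> a = s.filter(x -> x % 2 == 0); // first link onto s
        try {
            Stream<Integer> b = s.map(x -> x * 10);    // second link onto s
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Linear alternative: one fresh stream per branch.
    static List<Integer> evens(List<Integer> source) {
        return source.stream().filter(x -> x % 2 == 0).collect(Collectors.toList());
    }

    static List<Integer> timesTen(List<Integer> source) {
        return source.stream().map(x -> x * 10).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3, 4);
        System.out.println(forkingFails());
        System.out.println(evens(source));
        System.out.println(timesTen(source));
    }
}
```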
On 11/15/2012 2:16 PM, Remi Forax wrote:
> This sprang up on the lambda-dev list and Brian asked me to transfer it to
> the EG list.
> Given that the last messages only involved members of this EG, I have
> copy/pasted these messages.
>
> The problem is how to deal with streams that are created from IO objects.
> Should the implementation throw a runtime exception if such a stream
> is iterated twice,
> by a forEach for example?
>
> Rémi
>
>
> Brian wrote:
> > A related question is what should happen in this case:
> >
> > T first = stream.findFirst();
> > T second = stream.findFirst();
> >
> > This "accidentally" works in the current serial impl, but is in
> > general a nightmare. Terminal ops should probably "close" the stream.
>
>
> Sam wrote:
>
>> This was my thinking when I read the example. Not sure if that is
>> practical but it might reduce errors such as the one described.
>>
>> Sam
>
> On Nov 15, 2012, at 9:36 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>
>> On 11/15/2012 06:22 PM, Brian Goetz wrote:
>>> The best way to think about it is that a Stream is more like an Iterator
>>> than a data structure. There is some abstract source of data somewhere
>>> (it might be in a data structure, or might be generated from a function,
>>> or read from a network), and a series of transformations applied to the
>>> data between the source and the consumer. Streams can additionally
>>> execute using parallelism, if requested.
>>>
>>> Stream constructs like:
>>>
>>> Stream<Person> s = people.stream()
>>> .filter(p -> p.getLastName().equals("Smith"));
>>>
>>> do not do any filtering on construction. They simply say "there's a
>>> stream source, the collection 'people', and when you consume from the
>>> stream s, you'll get the results of filtering the source values."
>>>
>>> The confusion in Dmitry's example is akin to multiple activities reading
>>> from the same IO channel -- they might interfere with each other over
>>> who gets the next value, and any buffering that any consumer does may
>>> confuse other consumers.
>>
>> Maybe the implementation should protect users from using two aliases of a
>> non-replayable stream.
>> Using the example of Dmitry, if the stream is an IO channel, the second
>> call to limit() or to any method of 's' should throw an
>> IllegalStateException.
>>
>> Rémi
>>
>
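
The quoted point that constructing a filtered stream does no filtering can be checked with a small sketch. It assumes the Java 8 java.util.stream API as it later shipped; the class name, the counter, and the sample last names are hypothetical. The predicate is only invoked once the stream is consumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; assumes the Java 8 java.util.stream API.
public class LazyFilterSketch {

    // Returns { predicate calls after construction, predicate calls after consumption }.
    static int[] predicateCalls() {
        List<String> lastNames = Arrays.asList("Smith", "Jones", "Smith");
        AtomicInteger calls = new AtomicInteger();

        // Building the pipeline only records the filter; nothing runs yet.
        Stream<String> s = lastNames.stream()
                .filter(n -> {
                    calls.incrementAndGet();
                    return n.equals("Smith");
                });
        int afterConstruction = calls.get();

        // The terminal op pulls elements through the filter.
        List<String> smiths = s.collect(Collectors.toList());
        int afterConsumption = calls.get();

        return new int[] { afterConstruction, afterConsumption };
    }

    public static void main(String[] args) {
        int[] r = predicateCalls();
        System.out.println(r[0] + " " + r[1]);
    }
}
```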