Fwd: Re: Stream.limit() - puzzler/bug/feature
Remi Forax
forax at univ-mlv.fr
Fri Nov 16 04:50:15 PST 2012
On 11/15/2012 08:45 PM, Brian Goetz wrote:
> So, there's a couple of questions here.
>
> - Linear stream chain, but multiple use. For example:
>
> Stream<T> s = ...
> T first = s.findFirst();
> T second = s.findFirst();
>
> Should this be legal? It seems harmless in this example, and the
> serial implementation is trivial, but making guarantees about what
> state a stream is left in after a terminal operation is a rats nest
> for very little value. (If you want to get at the elements in a way
> that the built-in ops can't support, the escape hatch is "iterator" or
> "spliterator".)
>
> Preference: calling a terminal op on a stream, or asking for an
> iterator/spliterator, "seals" the stream and later use of that stream
> is an error. Not clear we can enforce this at acceptable cost, so the
> fallback is "results are undefined and implementations are encouraged
> to free any resources held."
I fully agree, and it should be enforced by throwing a runtime exception;
otherwise we will have a stream :) of bug reports.
There is also another good reason not to allow that: otherwise, the
Stream interface could be used in place of Collection, which would add
a burden on all API designers (should I take a Collection, an Iterator
or a Stream?).
A Stream is a weaker version of an Iterator: once you start to pull
values from it, the whole stream chain is invalidated.
BTW, I'm still not sure we should not provide an iterator() method to be
able to 'upgrade' to iterator semantics
(for the parallel implementation, iterator() would be equivalent to
sequential().iterator()).
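
To make the proposed enforcement concrete, here is a minimal sketch. It
assumes the java.util.stream API as it was later released in Java 8,
which may differ from the prototype discussed in this thread; the class
and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical demo class; assumes the Java 8 java.util.stream API.
class StreamReuseDemo {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTerminalOpThrows() {
        List<String> names = Arrays.asList("a", "b", "c");
        Stream<String> s = names.stream();
        Optional<String> first = s.findFirst();   // first terminal op consumes s
        try {
            s.findFirst();                        // second use of the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;                          // reuse is reported at runtime
        }
    }

    public static void main(String[] args) {
        System.out.println("second terminal op rejected: " + secondTerminalOpThrows());
    }
}
```

Under that assumption the second findFirst() fails with an
IllegalStateException instead of silently returning a value.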
>
> - Nonlinear stream graph. For example:
>
> Stream<T> s = ...
> Stream<T> a = s.filter(...);
> Stream<T> b = s.map(...);
> // use a and b
>
> We don't want to go out of our way to support this as it constrains
> our ability to optimize using lookahead and such in the common case.
> Again, ideally the above should probably be an error, but not clear if
> we can enforce that perfectly/economically.
It should be enforced: calling a terminal method should seal the whole
stream chain, and throw an exception
if a stream in the chain is already sealed.
From the implementation perspective, when you attach an operation to
a stream, you should seal it,
so streams are mutable but ops are not. It seems doable (from my planet :)
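
One way to picture that sealing rule is the sketch below. It is only an
illustration under my own assumptions, not the actual implementation:
SealableStream, of(), seal() and toList() are hypothetical names, and
the pipeline is evaluated over a List for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical stage type: the pipeline description is immutable,
// the "sealed" flag is the only mutable state.
final class SealableStream<T> {
    private final Supplier<List<T>> pipeline; // upstream plus this stage's op
    private boolean sealed;                   // flipped on first use

    private SealableStream(Supplier<List<T>> pipeline) {
        this.pipeline = pipeline;
    }

    static <T> SealableStream<T> of(List<T> source) {
        return new SealableStream<T>(() -> source);
    }

    // Marks this stage as used; a second use is an error.
    private void seal() {
        if (sealed) {
            throw new IllegalStateException("stream already linked or consumed");
        }
        sealed = true;
    }

    SealableStream<T> filter(Predicate<? super T> p) {
        seal(); // attaching an op seals the upstream stage
        Supplier<List<T>> up = pipeline;
        return new SealableStream<T>(() -> {
            List<T> out = new ArrayList<>();
            for (T t : up.get()) {
                if (p.test(t)) out.add(t);
            }
            return out;
        });
    }

    <R> SealableStream<R> map(Function<? super T, ? extends R> f) {
        seal();
        Supplier<List<T>> up = pipeline;
        return new SealableStream<R>(() -> {
            List<R> out = new ArrayList<>();
            for (T t : up.get()) out.add(f.apply(t));
            return out;
        });
    }

    // Terminal op: seals the last stage, then runs the pipeline.
    List<T> toList() {
        seal();
        return pipeline.get();
    }
}
```

Because every upstream stage is sealed when an op is attached to it, the
whole chain is sealed by the time the terminal op runs, and both
s.filter(...) followed by s.map(...) and a second terminal call fail fast.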
cheers,
Rémi
>
> On 11/15/2012 2:16 PM, Remi Forax wrote:
>> This sprang up on the lambda-dev list and Brian asked me to transfer
>> it to the EG list.
>> Given that the last messages only involved members of this EG, I have
>> copy/pasted these messages.
>>
>> The problem is how to deal with streams that are created from IO
>> objects.
>> Should the implementation throw a runtime exception if such a stream
>> is iterated twice,
>> by a forEach for example?
>>
>> Rémi
>>
>>
>> Brian wrote:
>> > A related question is what should happen in this case:
>> >
>> > T first = stream.findFirst();
>> > T second = stream.findFirst();
>> >
>> > This "accidentally" works in the current serial impl, but is in
>> > general a nightmare. Terminal ops should probably "close" the stream.
>>
>>
>> Sam wrote:
>>
>>> This was my thinking when I read the example. Not sure if that is
>>> practical but it might reduce errors such as the one described.
>>>
>>> Sam
>>
>> On Nov 15, 2012, at 9:36 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>>
>>> On 11/15/2012 06:22 PM, Brian Goetz wrote:
>>>> The best way to think about it is that a Stream is more like an
>>>> Iterator than a data structure. There is some abstract source of
>>>> data somewhere (it might be in a data structure, or might be
>>>> generated from a function, or read from a network), and a series of
>>>> transformations applied to the data between the source and the
>>>> consumer. Streams can additionally execute using parallelism, if
>>>> requested.
>>>>
>>>> Stream constructs like:
>>>>
>>>> Stream<Person> s = people.stream()
>>>>     .filter(p -> p.getLastName().equals("Smith"));
>>>>
>>>> do not do any filtering on construction. They simply say "there's a
>>>> stream source, the collection 'people', and when you consume from the
>>>> stream s, you'll get the results of filtering the source values."
>>>>
>>>> The confusion in Dmitry's example is akin to multiple activities
>>>> reading from the same IO channel -- they might interfere with each
>>>> other over who gets the next value, and any buffering that any
>>>> consumer does may confuse other consumers.
>>>
>>> Maybe the implementation should protect users from using two aliases
>>> of a non-replayable stream.
>>> Using Dmitry's example, if the stream is an IO channel, the second
>>> call to limit() or to any other method of 's' should throw an
>>> IllegalStateException.
>>>
>>> Rémi
>>>
>>
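
The point quoted above, that building a filtered stream does no
filtering until something consumes it, can be checked with a small
sketch. It assumes the java.util.stream API as later released in Java 8;
LazyFilterDemo, predicateCalls() and the sample names are made up for
the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; assumes the Java 8 java.util.stream API.
class LazyFilterDemo {
    // Returns {calls before the terminal op, calls after the terminal op}.
    static int[] predicateCalls() {
        List<String> lastNames = Arrays.asList("Smith", "Jones", "Smith");
        int[] calls = {0};                       // counts predicate invocations
        Stream<String> s = lastNames.stream()
                .filter(n -> { calls[0]++; return n.equals("Smith"); });
        int before = calls[0];                   // pipeline built, nothing consumed
        List<String> smiths = s.collect(Collectors.toList()); // terminal op pulls data
        int after = calls[0];
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] c = predicateCalls();
        System.out.println("before=" + c[0] + ", after=" + c[1]);
    }
}
```

The predicate is not invoked while the pipeline is being built; it runs
once per source element only when collect() consumes the stream.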