peek().iterator().hasNext() pre-consumes elements?
Brian Goetz
brian.goetz at oracle.com
Thu Feb 21 10:52:11 PST 2013
Good point. I'll have to tighten up the spec to allow for this; there's
no way we can prevent hasNext() from "peeking" ahead, nor can we delay
the side-effects until the caller actually calls next().
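
To illustrate why hasNext() ends up "peeking", here is a minimal sketch of a
hand-written lookahead iterator. It is not the actual Streams implementation;
the class LookaheadIteratorSketch, the nested LookaheadIterator, and the
convention that the source returns null when exhausted are all assumptions made
for this example. The point is only that hasNext() must pull an element to
answer truthfully, so the peek action has already run by the time next() is
called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical illustration only; not the java.util.stream implementation.
class LookaheadIteratorSketch {

    // Iterator over a pull-style source that signals exhaustion by returning null
    // (an assumption of this sketch).
    static final class LookaheadIterator<T> implements Iterator<T> {
        private final Supplier<T> source;
        private final Consumer<? super T> peekAction;
        private T buffered;
        private boolean hasBuffered;
        private boolean done;

        LookaheadIterator(Supplier<T> source, Consumer<? super T> peekAction) {
            this.source = source;
            this.peekAction = peekAction;
        }

        @Override
        public boolean hasNext() {
            if (hasBuffered) {
                return true;
            }
            if (done) {
                return false;
            }
            // The only way to learn whether an element exists is to ask for it.
            T next = source.get();
            if (next == null) {
                done = true;
                return false;
            }
            // The side effect fires here, before the caller ever calls next().
            peekAction.accept(next);
            buffered = next;
            hasBuffered = true;
            return true;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            T result = buffered;
            buffered = null;
            hasBuffered = false;
            return result;
        }
    }

    // Returns the elements the peek action has seen after one hasNext() call.
    static List<Integer> seenAfterHasNext() {
        Iterator<Integer> data = Arrays.asList(1, 2, 3).iterator();
        List<Integer> seen = new ArrayList<>();
        Supplier<Integer> source = () -> data.hasNext() ? data.next() : null;
        Consumer<Integer> action = seen::add;
        Iterator<Integer> it = new LookaheadIterator<>(source, action);
        it.hasNext(); // next() is never called
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(seenAfterHasNext()); // prints [1]
    }
}
```

A source that knows its size could answer hasNext() without pulling, but a
general source (an IO channel, say) cannot, which is why the sketch buffers one
element.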
On (2), you are right that the upstream operation is s1. Again, I think
the existing spec is simply sloppy (though having a spec at all is an
improvement!) and needs to be tightened.
On 2/21/2013 1:28 PM, Georgiy Rakov wrote:
> If it's s2 then the following two questions arise:
>
> 1. The behavior originally described by Dmitry, i.e. when the following
> code prints 1, could be considered faulty because calling
> hasNext() doesn't mean /consuming/, I believe.
>
> Arrays.asList(1, 2,
> 3).stream().peek(System.err::println).iterator().hasNext()
>
> Please confirm if this is right and going to be fixed.
>
> 2. Regarding the following part of the spec (say *part A*):
>
> * <p>For parallel stream pipelines, the {@code Consumer} may be
> called at whatever time and in whatever thread
> * the element is made available by the /_*upstream operation*_/. If
> the {@code Consumer} modifies shared state,
>
> If I haven't got mixed up in the terminology, /_*upstream operation*_/
> means s1. So it looks like a contradiction with the first part of the spec
> (say *part B*):
>
> * Produce a {@code Stream} containing the elements of this stream,
> and also provide elements
> * to the specified {@link Consumer} as elements are consumed from
> the /_*resulting stream*_/. This is
>
> because in accordance with *part A* the consumer is called when *the element
> is made available by /_s1_/*, while in accordance with *part B* the
> consumer is called when *the element is consumed from /_s2_/*.
> If that's really the case then such a big difference between the behavior of
> parallel and sequential pipelines should be stated more explicitly in
> the spec, I believe. Please provide your comments.
>
> Thank you,
> Georgiy.
>
>
> On 21.02.2013 19:57, Brian Goetz wrote:
>> It is s2.
>>
>> Once you do s2 = s1.peek(), s1 is "used up" and all access to the
>> stream data is through s2.
>>
>> On 2/21/2013 5:08 AM, Georgiy Rakov wrote:
>>> Could you please provide some more information regarding the following part
>>> of this spec:
>>>
>>> * Produce a {@code Stream} containing the elements of this stream,
>>> and also provide elements
>>> * to the specified {@link Consumer} as elements are consumed from
>>> the /_*resulting stream*_/. This is
>>>
>>>
>>> What is the "*resulting stream*": the stream returned by peek() or the stream
>>> that peek() is applied to, i.e. considering the following code, s1 or s2?
>>>
>>> Stream s1;
>>> ...
>>> Stream s2 = s1.peek();
>>>
>>> Thanks,
>>> Georgiy.
>>>
>>> On 20.02.2013 20:38, Brian Goetz wrote:
>>>> Here's the current spec for this method -- does this help?
>>>>
>>>> /**
>>>> * Produce a {@code Stream} containing the elements of this
>>>> stream, and also provide elements
>>>> * to the specified {@link Consumer} as elements are consumed from
>>>> the resulting stream. This is
>>>> * an <a href="package-summary.html#StreamOps">intermediate
>>>> operation</a>.
>>>> * {@apiNote}
>>>> * This method exists mainly to support debugging, where you want
>>>> to see the elements as they flow past a certain
>>>> * point in a pipeline:
>>>> * <pre>
>>>> * list.stream()
>>>> * .filter(filteringFunction)
>>>> * .peek(e -> { System.out.println("Filtered value: " + e); })
>>>> * .map(mappingFunction)
>>>> * .peek(e -> { System.out.println("Mapped value: " + e); })
>>>> * .collect(Collectors.intoList());
>>>> * </pre>
>>>> *
>>>> * <p>For parallel stream pipelines, the {@code Consumer} may be
>>>> called at whatever time and in whatever thread
>>>> * the element is made available by the upstream operation. If the
>>>> {@code Consumer} modifies shared state,
>>>> * it is responsible for providing the required synchronization.
>>>> *
>>>> * @param consumer The {@code Consumer} to receive the elements
>>>> */
>>>>
>>>>
>>>>
>>>> On 2/20/2013 10:02 AM, Georgiy Rakov wrote:
>>>>> Hello again,
>>>>>
>>>>> it has just occurred to me that this could be a considerably bigger issue
>>>>> than I wrote in my previous letter.
>>>>>
>>>>> So here is the case, slightly rewritten:
>>>>>
>>>>> Stream s1 = Arrays.asList(1, 2, 3).stream();
>>>>> Stream s2 = s1.peek(System.err::println);
>>>>> s2.iterator().hasNext();
>>>>>
>>>>>
>>>>> The spec says:
>>>>>
>>>>> Produce a Stream containing the elements of this stream, and also
>>>>> provide elements to the specified Consumer as elements are *passed
>>>>> through*.
>>>>>
>>>>> So the core question is what does "passed through" mean? At first
>>>>> glance I would say it means *consuming elements from the stream returned
>>>>> by peek()* (not from the stream which peek() is applied to). If this
>>>>> interpretation is right then I would suppose it's a bug, because the
>>>>> element from s2 has not been consumed yet (next() has not been called,
>>>>> only hasNext() has been called).
>>>>>
>>>>> Could you please confirm if such reasoning is right and it's really a
>>>>> bug.
>>>>>
>>>>> Thanks, Georgiy.
>>>>>
>>>>> On 12.02.2013 23:01, Remi Forax wrote:
>>>>>> On 02/12/2013 07:16 PM, Brian Goetz wrote:
>>>>>>> The answer here is complicated, but in general, calling hasNext may
>>>>>>> well
>>>>>>> require consuming an element -- there's often no way to know
>>>>>>> whether a
>>>>>>> source would produce an element without asking it to do so. So it
>>>>>>> is a
>>>>>>> common practice in implementing iterators to do this (one of many
>>>>>>> reasons why we did not build Streams on Iterator.)
>>>>>>>
>>>>>>> Because the elements are coming from an array, it might be
>>>>>>> possible to
>>>>>>> know simply based on how many elements have gone by that the
>>>>>>> stream is
>>>>>>> not yet exhausted. But in the general case (such as when the stream
>>>>>>> source is an IO channel), it is not possible to know without
>>>>>>> actually
>>>>>>> consuming and buffering some input. So I would put this in the
>>>>>>> category
>>>>>>> of "acceptable" behavior. We might someday do some work to take
>>>>>>> advantage of the fact that the source has the SIZED
>>>>>>> characteristic and
>>>>>>> the pipeline stages are size-preserving to make this case behave
>>>>>>> "better", but that would be an implementation quality issue, not a
>>>>>>> spec
>>>>>>> issue. The behavior you observe is allowable by the spec.
>>>>>> while a stream may have to do some buffering, peek should always be
>>>>>> transparent and an iterator on an array doesn't need any buffering
>>>>>> but I
>>>>>> agree that this is an implementation issue.
>>>>>>
>>>>>> Rémi
>>>>>>
>>>>>>> On 2/12/2013 12:53 PM, Dmitry Bessonov wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> The following line prints out the first element, "1"
>>>>>>>>
>>>>>>>> Arrays.asList(1, 2,
>>>>>>>> 3).stream().peek(System.err::println).iterator().hasNext()
>>>>>>>>
>>>>>>>> Is it really an expected behavior?
>>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>