peek().iterator().hasNext() pre-consumes elements?

Thu Feb 21 10:52:11 PST 2013

Good point.  I'll have to tighten up the spec to allow for this; there's 
no way we can prevent hasNext() from "peeking" ahead, nor can we delay 
the side-effects until the caller actually calls next().
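
To illustrate why hasNext() has to look ahead, here is a minimal sketch. It is not the actual Streams implementation. LookAheadIterator, its observer parameter (standing in for the peek() Consumer) and the null-means-exhausted Supplier are all hypothetical names introduced for this example. The source can only be asked "give me the next element, if any", so hasNext() must pull and buffer one element, and the observer fires at that moment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class LookAheadDemo {

    /** Hypothetical adapter: turns a pull-style source (null means exhausted) into an Iterator. */
    static final class LookAheadIterator<T> implements Iterator<T> {
        private final Supplier<T> source;            // returns null once exhausted
        private final Consumer<? super T> observer;  // plays the role of the peek() Consumer
        private T buffered;
        private boolean hasBuffered;

        LookAheadIterator(Supplier<T> source, Consumer<? super T> observer) {
            this.source = source;
            this.observer = observer;
        }

        @Override
        public boolean hasNext() {
            if (!hasBuffered) {
                // The only way to learn whether an element exists is to pull it.
                T next = source.get();
                if (next == null) {
                    return false;
                }
                observer.accept(next);   // side effect happens here, before next()
                buffered = next;
                hasBuffered = true;
            }
            return true;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            T result = buffered;
            buffered = null;
            hasBuffered = false;
            return result;
        }
    }

    /** Calls hasNext() once and returns what the observer has seen so far. */
    public static List<Integer> observedAfterHasNext() {
        Iterator<Integer> data = Arrays.asList(1, 2, 3).iterator();
        Supplier<Integer> source = () -> data.hasNext() ? data.next() : null;
        List<Integer> seen = new ArrayList<>();
        Iterator<Integer> it = new LookAheadIterator<>(source, seen::add);
        it.hasNext();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observedAfterHasNext()); // prints [1]
    }
}
```

In this sketch the observer has already seen the first element even though next() was never called, which is the same shape of behavior Dmitry reported.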

On (2), you are right that the upstream operation is s1.  Again, I think 
the existing spec is simply sloppy (though having a spec at all is an 
improvement!) and needs to be tightened.
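
For the parallel case in the spec text quoted below, here is a rough sketch of a Consumer that is safe to call from whatever thread delivers the element. It assumes the released java.util.stream API (parallelStream(), Collectors.toList()), which may differ from the draft API quoted in this thread. ParallelPeekDemo and countObserved are names made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;

public class ParallelPeekDemo {

    /** Runs a parallel pipeline and returns how many elements the peek() Consumer observed. */
    public static int countObserved() {
        // Thread-safe queue: the Consumer may run concurrently on several worker threads.
        Queue<Integer> seen = new ConcurrentLinkedQueue<>();

        List<Integer> doubled = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8)
                .parallelStream()
                .peek(seen::add)          // shared state, so it must be synchronized somehow
                .map(x -> x * 2)
                .collect(Collectors.toList());

        // The result list keeps encounter order; the order of 'seen' is unspecified.
        System.out.println("result = " + doubled);
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("observed = " + countObserved());
    }
}
```

Using a plain ArrayList for the shared state here would be a data race. The concurrent queue is one way of "providing the required synchronization"; it makes no promise about the order in which elements are observed.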

On 2/21/2013 1:28 PM, Georgiy Rakov wrote:
> If it's s2 then two following questions arise:
>
> 1. The behavior originally described by Dmitry, i.e. that the following
> code prints 1, could be considered faulty, because I believe calling
> hasNext() doesn't mean /consuming/.
>
>        Arrays.asList(1, 2, 3).stream().peek(System.err::println).iterator().hasNext()
>
> Please confirm if this is right and going to be fixed.
>
> 2. Regarding the following part of the spec (call it *part A*):
>
>     * <p>For parallel stream pipelines, the {@code Consumer} may be
>     called at whatever time and in whatever thread
>     * the element is made available by the /_*upstream operation*_/. If
>     the {@code Consumer} modifies shared state,
>
> If I haven't mixed up the terminology, /_*upstream operation*_/
> means s1. So it looks like a contradiction with the first part of the spec
> (call it *part B*):
>
>     * Produce a {@code Stream} containing the elements of this stream,
>     and also provide elements
>     * to the specified {@link Consumer} as elements are consumed from
>     the /_*resulting stream*_/.  This is
>
> because according to *part A* the consumer is called when the *element
> is made available by /_s1_/*, while according to *part B* the
> consumer is called when *the element is consumed from /_s2_/*.
> If that's really the case, then I believe such a big difference between the
> behavior of parallel and sequential pipelines should be stated more
> explicitly in the spec. Please provide your comments.
>
> Thank you,
> Georgiy.
>
>
> On 21.02.2013 19:57, Brian Goetz wrote:
>> It is s2.
>>
>> Once you do s2 = s1.peek(), s1 is "used up" and all access to the
>> stream data is through s2.
>>
>> On 2/21/2013 5:08 AM, Georgiy Rakov wrote:
>>> Could you please provide some more information regarding following part
>>> of this spec:
>>>
>>>       * Produce a {@code Stream} containing the elements of this stream,
>>> and also provide elements
>>>       * to the specified {@link Consumer} as elements are consumed from
>>> the /_*resulting stream*_/.  This is
>>>
>>>
>>> What is the "*resulting stream*": the stream returned by peek() or the stream
>>> that peek() is applied to? That is, in the following code, is it s1 or s2?
>>>
>>>     Stream s1;
>>>     ...
>>>     Stream s2 = s1.peek();
>>>
>>> Thanks,
>>> Georgiy.
>>>
>>> On 20.02.2013 20:38, Brian Goetz wrote:
>>>> Here's the current spec for this method -- does this help?
>>>>
>>>>     /**
>>>>      * Produce a {@code Stream} containing the elements of this
>>>> stream, and also provide elements
>>>>      * to the specified {@link Consumer} as elements are consumed from
>>>> the resulting stream.  This is
>>>>      * an <a href="package-summary.html#StreamOps">intermediate
>>>> operation</a>.
>>>>      * {@apiNote}
>>>>      * This method exists mainly to support debugging, where you want
>>>> to see the elements as they flow past a certain
>>>>      * point in a pipeline:
>>>>      * <pre>
>>>>      *     list.stream()
>>>>      *         .filter(filteringFunction)
>>>>      *         .peek(e -> System.out.println("Filtered value: " + e))
>>>>      *         .map(mappingFunction)
>>>>      *         .peek(e -> System.out.println("Mapped value: " + e))
>>>>      *         .collect(Collectors.intoList());
>>>>      * </pre>
>>>>      *
>>>>      * <p>For parallel stream pipelines, the {@code Consumer} may be
>>>> called at whatever time and in whatever thread
>>>>      * the element is made available by the upstream operation. If the
>>>> {@code Consumer} modifies shared state,
>>>>      * it is responsible for providing the required synchronization.
>>>>      *
>>>>      * @param consumer The {@code Consumer} to receive the elements
>>>>      */
>>>>
>>>>
>>>>
>>>> On 2/20/2013 10:02 AM, Georgiy Rakov wrote:
>>>>> Hello again,
>>>>>
>>>>> it has just occurred to me that this could be a more serious issue
>>>>> than I described in my previous letter.
>>>>>
>>>>> So here is the case, slightly rewritten:
>>>>>
>>>>>     Stream s1 = Arrays.asList(1, 2, 3).stream();
>>>>>     Stream s2 = s1.peek(System.err::println);
>>>>>     s2.iterator().hasNext();
>>>>>
>>>>>
>>>>> The spec says:
>>>>>
>>>>>     Produce a Stream containing the elements of this stream, and also
>>>>>     provide elements to the specified Consumer as elements are *passed
>>>>>     through*.
>>>>>
>>>>> So the core question is: what does "passed through" mean? At first
>>>>> glance I would say it means *consuming elements from the stream returned
>>>>> by peek()* (not from the stream which peek() is applied to). If this
>>>>> interpretation is right, then I would suppose it's a bug, because the
>>>>> element from s2 has not been consumed yet (next() has not been called, only
>>>>> hasNext() has been called).
>>>>>
>>>>> Could you please confirm if such reasoning is right and it's really a
>>>>> bug.
>>>>>
>>>>> Thanks, Georgiy.
>>>>>
>>>>> On 12.02.2013 23:01, Remi Forax wrote:
>>>>>> On 02/12/2013 07:16 PM, Brian Goetz wrote:
>>>>>>> The answer here is complicated, but in general, calling hasNext may
>>>>>>> well require consuming an element -- there's often no way to know
>>>>>>> whether a source would produce an element without asking it to do
>>>>>>> so. So it is a common practice in implementing iterators to do this
>>>>>>> (one of many reasons why we did not build Streams on Iterator.)
>>>>>>>
>>>>>>> Because the elements are coming from an array, it might be possible
>>>>>>> to know simply based on how many elements have gone by that the
>>>>>>> stream is not yet exhausted.  But in the general case (such as when
>>>>>>> the stream source is an IO channel), it is not possible to know
>>>>>>> without actually consuming and buffering some input.  So I would put
>>>>>>> this in the category of "acceptable" behavior.  We might someday do
>>>>>>> some work to take advantage of the fact that the source has the
>>>>>>> SIZED characteristic and the pipeline stages are size-preserving to
>>>>>>> make this case behave "better", but that would be an implementation
>>>>>>> quality issue, not a spec issue.  The behavior you observe is
>>>>>>> allowable by the spec.
>>>>>> while a stream may have to do some buffering, peek should always be
>>>>>> transparent and an iterator on an array doesn't need any buffering,
>>>>>> but I agree that this is an implementation issue.
>>>>>>
>>>>>> Rémi
>>>>>>
>>>>>>> On 2/12/2013 12:53 PM, Dmitry Bessonov wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> The following line prints out the first element, "1"
>>>>>>>>
>>>>>>>>       Arrays.asList(1, 2, 3).stream().peek(System.err::println).iterator().hasNext()
>>>>>>>>
>>>>>>>> Is this really the expected behavior?
>>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>