The stream abstraction and substream()
Paul Sandoz
paul.sandoz at oracle.com
Wed Apr 3 01:01:20 PDT 2013
On Apr 3, 2013, at 8:07 AM, Jose <jgetino at telefonica.net> wrote:
>
>
> When comparing the present Stream<T> abstraction with other similar
> abstractions I'm using in my code,
> mainly byte streams for reading/writing binary files, I miss the most common
> operations I use: doing limited reads of the stream.
> For example, I can read a byte, a short, a byte array with the following n
> elements and so on.
>
> For Stream<T> and restricting to n=1 to keep it simple, the equivalent
> operation would be:
>
> T next()
>
> But for Streams supporting concatenation, a very clever design, the reading
> operations have to return a Stream, so in the pipeline context
> the operation should be in fact:
>
> Stream<T> next(Consumer<T> c)
>
>
> Reading the Stream documentation I see that Stream<T> supports a kind of
> "skip" operation:
>
>
> Stream<T> substream(long startingOffset)
>
>
> Which is described as "producing a Stream consisting of the elements of
> this stream, discarding the first startingOffset elements"
> If you are not going to read these elements, I would prefer the name skip(n)
> more than substream(n), because substream says little in the pipeline
> context,
> almost everything produces a stream, and a filter can also be seen as
> producing a kind of substream.
>
> The problem with this operation is that it really skips n elements, doing
> nothing with them. As you can't process them, you can't emulate a next()
> operation for example.
>
> So my question is, why not implement a more general operation instead (maybe
> using another name)
>
> Stream<T> substream(long n, Consumer<T>)
>
> and give the user the chance of processing the first n elements of the
> stream?
>
> Or am I missing something important here?
>
Yes, parallel evaluation. In what order should elements be reported to the consumer? When a stream is evaluated in parallel, the functional interfaces passed to operations may be called concurrently and in any order.
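To make the ordering point concrete, here is a minimal sketch, assuming the java.util.stream API as released in Java 8. The class and method names (ParallelOrderDemo, encounterLog) are hypothetical and only for illustration. It records the order in which a parallel forEach hands elements to its consumer; every element arrives, but the order may differ from run to run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.IntStream;

class ParallelOrderDemo {
    // Records the order in which a parallel forEach calls the consumer.
    static List<Integer> encounterLog(int count) {
        // Thread-safe collection: the consumer may be called concurrently.
        Queue<Integer> seen = new ConcurrentLinkedQueue<>();
        IntStream.range(0, count).boxed().parallel().forEach(seen::add);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // All 16 elements are printed, but not necessarily as 0, 1, 2, ...
        System.out.println(encounterLog(16));
    }
}
```

A substream(n, consumer) operation would face the same question: the consumer would see the right elements, but with no guarantee about the order or the thread.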
You can convert the stream into an iterator, which represents a terminal operation. Then you can call hasNext()/next().
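A small sketch of the iterator route, assuming the Java 8 release API. The helper readUpTo and the class name IteratorReadDemo are hypothetical, not part of the library. Once iterator() has been called, the stream itself cannot be used for further pipeline operations; the remaining elements are only reachable through the iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class IteratorReadDemo {
    // Reads up to n elements one at a time, like a limited read on a byte stream.
    static <T> List<T> readUpTo(Iterator<T> it, int n) {
        List<T> out = new ArrayList<>();
        while (out.size() < n && it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // iterator() is a terminal operation on the stream.
        Iterator<String> it = Stream.of("a", "b", "c", "d").iterator();
        String first = it.next();                 // the "T next()" from the question
        List<String> nextTwo = readUpTo(it, 2);   // a limited read of two elements
        System.out.println(first + " " + nextTwo);
    }
}
```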
What I think you are asking for is a way to split the stream into two. It is possible, but there are some awkward edge cases (e.g. when the size is not known).
Here is how you can do it: get the spliterator from the stream, consume up to n elements from the spliterator and buffer them in an array list, create a new stream from that buffer, and create another new stream from the spliterator.
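Those steps can be sketched as follows, assuming the Java 8 release API (Stream.spliterator(), Spliterator.tryAdvance(), StreamSupport.stream()). The names StreamSplit, Pair and splitAt are hypothetical, and the sketch is sequential only; it is one possible reading of the recipe, not a definitive implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class StreamSplit {
    // Hypothetical holder for the two resulting streams.
    static final class Pair<T> {
        final Stream<T> head;
        final Stream<T> tail;

        Pair(Stream<T> head, Stream<T> tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    static <T> Pair<T> splitAt(Stream<T> source, long n) {
        // Step 1: get the spliterator from the stream (a terminal operation).
        Spliterator<T> spliterator = source.spliterator();

        // Step 2: consume up to n elements and buffer them.
        // tryAdvance returns false once the source has no more elements,
        // so a source shorter than n simply yields a shorter buffer.
        List<T> buffer = new ArrayList<>();
        while (buffer.size() < n && spliterator.tryAdvance(buffer::add)) {
            // the element has already been added to the buffer
        }

        // Step 3: a new stream from the buffer.
        Stream<T> head = buffer.stream();
        // Step 4: another new stream from what is left in the spliterator.
        Stream<T> tail = StreamSupport.stream(spliterator, false);
        return new Pair<>(head, tail);
    }

    public static void main(String[] args) {
        Pair<Integer> parts = splitAt(Stream.of(1, 2, 3, 4, 5), 2);
        parts.head.forEach(x -> System.out.println("head: " + x));
        parts.tail.forEach(x -> System.out.println("tail: " + x));
    }
}
```

Note that the first n elements are consumed eagerly when splitAt is called, and that the original stream must not be used again afterwards.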
Paul.
More information about the lambda-dev
mailing list