The stream abstraction and substream()

Wed Apr 3 01:01:20 PDT 2013

On Apr 3, 2013, at 8:07 AM, Jose <jgetino at telefonica.net> wrote:

> 
> 
> When comparing the present Stream<T> abstraction with other similar
> abstractions I'm using in my code, 
> mainly byte streams for reading/writing binary files, I miss the most common
> operations I use: doing limited reads of the stream. 
> For example, I can read a byte, a short, a byte array with the following n
> elements, and so on.
> 
> For Stream<T> and restricting to n=1 to keep it simple, the equivalent
> operation would be: 
> 
>                     T  next()  
> 
> But for Streams supporting concatenation, a very clever design, the reading
> operations have to return a Stream, so in the pipeline context
> the operation should be in fact:
> 
>                Stream<T> next(Consumer<T> c)
> 
> 
> Reading the Stream documentation I see that Stream<T> supports a kind of
> "skip" operation:
> 
> 
>                      Stream<T> substream(long startingOffset)
> 
> 
> Which is described as "producing a Stream consisting of the elements of
> this stream, discarding the first startingOffset elements".
> If you are not going to read these elements, I would prefer the name skip(n)
> more than substream(n), because substream says little in the pipeline
> context, 
> almost everything produces a stream, and a filter can also be seen as
> producing a kind of substream.
> 
> The problem with this operation is that it really skips n elements, doing
> nothing with them. As you can't process them, you can't emulate a next()
> operation, for example.
> 
> So my question is, why not implement a more general operation instead (maybe
> under another name):
> 
>                      Stream<T> substream(long n, Consumer<T>)
> 
> and give the user the chance of processing the first n elements of the
> stream?
> 
> Or am I missing something important here?
> 

Yes: parallel evaluation. In what order should elements be reported to the consumer? When a stream is evaluated in parallel, the functional interfaces passed to operations may be called concurrently and in any order.
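
To make the ordering problem concrete, here is a minimal sketch written against the released Java 8 API, which may differ from the draft API discussed in this thread. The class and method names (ParallelOrderDemo, observedOrder) are made up for illustration. A consumer records the elements in the order it happens to be called; on a parallel stream that order is not guaranteed to match the encounter order and may change from run to run.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the JDK.
class ParallelOrderDemo {

    // Records the order in which a consumer sees elements of a parallel stream.
    static List<Integer> observedOrder(int n) {
        // Thread-safe list, because the consumer may be called concurrently.
        List<Integer> seen = new CopyOnWriteArrayList<>();
        IntStream.range(0, n)
                 .parallel()
                 .boxed()
                 .forEach(seen::add); // forEach makes no ordering promise here
        return seen;
    }

    public static void main(String[] args) {
        // Every element appears exactly once, but not necessarily as 0, 1, 2, ...
        System.out.println(observedOrder(20));
    }
}
```

A consumer attached to "the first n elements" would face the same issue: all n elements would arrive, but possibly concurrently and out of order.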

You can convert the stream into an iterator; obtaining the iterator is a terminal operation. Then you can call hasNext()/next().
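
A small sketch of the iterator route, again assuming the released Java 8 API. IteratorDemo and firstThenRest are illustrative names only. It reads one element with next() and then drains the rest sequentially.

```java
import java.util.Iterator;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the JDK.
class IteratorDemo {

    // Treats the first element specially, then appends the remaining ones.
    static String firstThenRest(Stream<String> stream) {
        Iterator<String> it = stream.iterator(); // terminal operation on the stream
        StringBuilder sb = new StringBuilder();
        if (it.hasNext()) {
            sb.append("first=").append(it.next());
        }
        while (it.hasNext()) {
            sb.append(",").append(it.next());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstThenRest(Stream.of("a", "b", "c")));
    }
}
```

Once the iterator has been obtained, the original stream cannot be used for further pipeline operations, and the traversal is sequential.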

What I think you are asking for is a way to split the stream into two. It is possible, but there are some awkward edge cases (e.g. when the size is not known).

Here is how you can do it: get the spliterator from the stream, consume up to n elements from the spliterator and buffer them in an array list, create a new stream from that buffer, and create another new stream from the remaining spliterator.
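
The steps above can be sketched roughly as follows, assuming the released Java 8 API (Stream.spliterator(), Spliterator.tryAdvance(), StreamSupport.stream()). The names StreamSplit, Split and splitAt are hypothetical, and making the tail stream sequential is a choice made for this example, not something the description prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper, not part of the JDK.
class StreamSplit {

    // Simple holder for the two resulting streams.
    static final class Split<T> {
        final Stream<T> head;
        final Stream<T> tail;

        Split(Stream<T> head, Stream<T> tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    // Buffers up to n leading elements; the rest stays in the spliterator.
    static <T> Split<T> splitAt(Stream<T> source, long n) {
        Spliterator<T> spliterator = source.spliterator();
        List<T> buffer = new ArrayList<>();
        // tryAdvance returns false once the source is exhausted,
        // so the buffer may end up with fewer than n elements.
        while (buffer.size() < n && spliterator.tryAdvance(buffer::add)) {
            // the element was added to the buffer by the consumer
        }
        Stream<T> head = buffer.stream();
        Stream<T> tail = StreamSupport.stream(spliterator, false); // sequential
        return new Split<>(head, tail);
    }

    public static void main(String[] args) {
        Split<Integer> split = splitAt(Stream.of(1, 2, 3, 4, 5), 2);
        System.out.println(split.head.collect(Collectors.toList()));
        System.out.println(split.tail.collect(Collectors.toList()));
    }
}
```

If the source has fewer than n elements, the head stream holds all of them and the tail stream is empty. The head is fully buffered in memory, so this is only reasonable for modest n.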

Paul.