Suggestion: buffer(int size) as the bulk operation

Paul Sandoz paul.sandoz at oracle.com
Tue Sep 18 05:33:43 PDT 2012


Hi Aleksey,

On Sep 13, 2012, at 12:45 PM, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:

> Hi,
> 
> Is there a merit for including semi-eager buffering stage in streams?
> This will arguably help in some of the corner cases when the stream
> generator, or even the ops have the temporal/spatial locality benefits.
> Normally we can force users to have eager terminators to compute
> everything at once. buffer() acts like a middle ground here, providing
> the means to balance the laziness vs. locality benefits.
> 
> I.e.:
> 
> list.stream().map(longOperation).buffer(10).iterator()
> 
> We have both the benefits of staying within lazy operations, and having
> the opportunity to process the operations in bulk, without consuming all
> the stream.
> 

So this buffer operation, when empty, will eagerly pull 10 elements from upstream and place them in its own managed array (or whatever is appropriate).

Would that be any different from wrapping the iterator in a BufferedIterator?
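
To make the comparison concrete, here is a minimal sketch of what such a wrapper might look like. BufferedIterator is a hypothetical class written for illustration, not an existing JDK type, and the refill policy (refill only once the buffer has drained) is my assumption about the intended behaviour:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical wrapper, for illustration only; not part of the JDK.
final class BufferedIterator<T> implements Iterator<T> {
    private final Iterator<? extends T> upstream;
    private final int size;
    private final List<T> buffer;
    private int pos = 0; // index of the next element to hand out

    BufferedIterator(Iterator<? extends T> upstream, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.upstream = upstream;
        this.size = size;
        this.buffer = new ArrayList<T>(size);
    }

    // Assumed policy: called only when the buffer is drained; eagerly
    // pulls up to `size` elements from upstream in one go.
    private void fill() {
        buffer.clear();
        pos = 0;
        while (buffer.size() < size && upstream.hasNext()) {
            buffer.add(upstream.next());
        }
    }

    @Override
    public boolean hasNext() {
        if (pos == buffer.size()) {
            fill();
        }
        return pos < buffer.size();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.get(pos++);
    }
}
```

With something like this, new BufferedIterator<>(upstream, 10) would pull up to 10 elements from upstream on the first call to hasNext() or next(), and then serve them lazily until the buffer runs out.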

Is it a good idea for the developer to select a size? Will they ever select a good one?


> BTW, this suggestion really shines if we can then have limited push
> traversal in the framework. That means, forEach which would dump not the
> entire stream contents, but only the next $limit elements. In this
> example we can then do limited forEach(10) pushes to buffer, and then let
> buffer provide us "lazy" pulls, doing bulk forEach(10) each time it
> drains out.
> 

Not sure I quite understand. Can you provide a more explicit example?

How would one synchronise the push with the pull?

Paul.

