flatMap() concept

Brian Goetz brian.goetz at oracle.com
Wed Nov 14 10:25:27 PST 2012


Valid observations, but I think you're focusing on an uncommon case of 
an uncommon case.

A key design assumption is that *most* stream operations will operate in 
"push" mode rather than "pull" mode.  Pull mode (using iterators) only 
comes into play when you have an operation that may not consume all the 
input.  Most terminal operations do consume all the input (forEach, 
into, reduce, groupBy) and therefore can use the more efficient push 
mode; only a few do not (findFirst, findAny).  Similarly, asking 
explicitly for an iterator() is an "escape hatch" for when you cannot 
accomplish your goals with the built-in functionality.  (That's the 
first level of "uncommon case"; the second level is that you are mapping 
one element to many.  This happens sometimes, but quite often one 
element maps to zero or one results.)
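
To make the push/pull distinction concrete, here is a minimal sketch in 
plain Java.  It is not the library's implementation; the class and 
method names (PushVsPull, pushAll, pullFirst, sum) are made up for 
illustration, and a List stands in for the stream source.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names come from the library.
class PushVsPull {
    // Push mode: the source drives the traversal and hands every element
    // to the sink. No iterator state is kept between elements.
    static <T> void pushAll(List<T> source, Consumer<T> sink) {
        for (T element : source) {
            sink.accept(element);
        }
    }

    // Pull mode: the consumer drives the traversal through an iterator
    // and is free to stop before the input is exhausted.
    static <T> T pullFirst(List<T> source) {
        Iterator<T> it = source.iterator();
        return it.hasNext() ? it.next() : null;
    }

    // A reduce-like operation consumes all the input, so push mode suffices.
    static int sum(List<Integer> source) {
        int[] total = {0};
        pushAll(source, x -> total[0] += x);
        return total[0];
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3);
        System.out.println(sum(numbers));        // consumes every element
        System.out.println(pullFirst(numbers));  // stops after one element
    }
}
```

Only the findFirst-like case needs the iterator; everything that reads 
the whole input can take the push path.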

The current design of flatMap is optimized for pushing; it creates no 
intermediate arrays, collections, iterators, or other bookkeeping 
overhead for the per-input-element results, and if an element maps to 
nothing, rather than creating an empty descriptor and iterating it, the 
implementation simply does nothing.  Much more efficient.  So while 
there may be a "waste of resources" in some cases, the current design is 
far more resource-efficient in the common cases.
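
A rough sketch of what "optimized for pushing" means, assuming a 
hypothetical FlatMapper interface with a single explodeInto method (the 
real interface and method names may differ) and using 
java.util.function.Consumer as a stand-in for the sink:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; names and signatures are assumptions.
class PushFlatMapSketch {
    // Push-style mapper: emits zero or more results for one input
    // element directly into the sink it is given.
    interface FlatMapper<T, U> {
        void explodeInto(T element, Consumer<U> sink);
    }

    // Each result goes straight to the downstream sink; no per-element
    // array, collection, or iterator is created along the way.
    static <T, U> void flatMapPush(List<T> source,
                                   FlatMapper<T, U> mapper,
                                   Consumer<U> downstream) {
        for (T element : source) {
            mapper.explodeInto(element, downstream);
        }
    }

    // Maps "a3" to 1, 2, 3. For "c0" the loop body never runs, so an
    // element that maps to nothing costs nothing beyond the call itself.
    static void countTo(String s, Consumer<Integer> sink) {
        int n = Integer.parseInt(s.substring(1));
        for (int i = 1; i <= n; i++) {
            sink.accept(i);
        }
    }

    // Terminal step that consumes everything, in the spirit of into().
    static List<Integer> collect(List<String> source) {
        List<Integer> out = new ArrayList<>();
        FlatMapper<String, Integer> fm = PushFlatMapSketch::countTo;
        Consumer<Integer> sink = out::add;
        flatMapPush(source, fm, sink);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(Arrays.asList("a3", "c0", "b2")));
    }
}
```

The trade-off raised in the quoted message shows up when such a mapper 
is driven through an iterator instead: the mapper cannot be paused 
partway through one input element, so all of that element's results are 
produced even if the caller stops early.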





On 11/14/2012 12:11 PM, Georgiy Rakov wrote:
> Hello,
>
> I have some concerns regarding the current FlatMapper concept.
>
> Let's consider stream *s* of strings: {"a3", "b1000"}. And FlatMapper
> instance *fm* which returns following sequences on each item:
> - on "a3" it returns integers 1, 2, 3;
> - on "b1000" it returns integers 1, 2,...,1000;
>
> Suppose we have the following situations while iterating over elements
> using the iterator returned by *s.flatMap(fm).iterator()*.
> 1. We've iterated over the first 3 items and then make a *hasNext()*
> call and quit iteration because we need no more elements.
> In this case, when *hasNext()* is called, all 1000 elements would be
> supplied to the *sink* passed to *fm*, though there is no need for it.
>
> 2. We've iterated over the first 4 items and then quit iteration
> because we need no more elements.
> In this case computational and memory resources will be spent on
> handling 999 elements that we have no need for.
>
> It seems desirable to avoid both of these.
>
> I'd like to note that the previous version of *flatMap*, which took a
> *Mapper<T, Iterable<U>>*, would allow us to avoid such potential waste
> of resources.
>
> Could you please give your comments.
>
> Thanks,
> Georgiy.
>


More information about the lambda-dev mailing list