flatMap() concept
Brian Goetz
brian.goetz at oracle.com
Wed Nov 14 10:25:27 PST 2012
Valid observations, but I think you're focusing on an uncommon case of
an uncommon case.
A key design assumption is that *most* stream operations will operate in
"push" mode rather than "pull" mode. Pull mode (using iterators) only
comes into play when you have an operation that may not consume all the
input. Most terminal operations do consume all the input (forEach,
into, reduce, groupBy) and can therefore use the more efficient push
mode; only a few do not (findFirst, findAny). Similarly, asking
explicitly for an iterator() is an "escape hatch" for when you cannot
accomplish your goals with the built-in functionality. (That's the
first level of "uncommon case"; the second level is that you are mapping
one element to many. This happens sometimes, but quite often one
element maps to zero or one.)
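To make the push/pull distinction concrete, here is a minimal sketch in plain Java. It does not use the stream library itself; the class and the method names pushAll and pullFirst are made up for illustration. In push mode the source drives the traversal and hands every element to a consumer (forEach-like); in pull mode the caller drives an iterator and is free to stop early (findFirst-like).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class PushVsPull {
    // Push mode: the source walks itself and hands each element to the sink.
    // No iterator state is exposed to the caller, and all input is consumed.
    public static <T> void pushAll(List<T> source, Consumer<? super T> sink) {
        for (T element : source) {
            sink.accept(element);
        }
    }

    // Pull mode: the caller asks for elements one at a time and may stop
    // before the input is exhausted. Returns null for an empty source.
    public static <T> T pullFirst(List<T> source) {
        Iterator<T> it = source.iterator();
        return it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b", "c");

        List<String> seen = new ArrayList<>();
        pushAll(source, seen::add);             // consumes everything
        System.out.println(seen);               // [a, b, c]

        System.out.println(pullFirst(source));  // a (stops after one element)
    }
}
```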
The current design of flatMap is optimized for pushing; it creates no
intermediate arrays, collections, iterators, or other bookkeeping
overhead for the per-input-element results, and if an element maps to
nothing, rather than creating an empty descriptor and iterating it, the
implementation simply does nothing. Much more efficient. So while
there may be a "waste of resources" in some cases, the current design is
far more resource-efficient in the common cases.
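A rough sketch of what "optimized for pushing" means, under stated assumptions: the FlatMapper interface below is a hypothetical stand-in written for this example (its shape and the method name flattenInto are assumptions, not the actual library signature), and flatMapForEach is an invented helper. The point is only that the mapper writes its results straight into the downstream sink, so nothing is allocated per input element, and an input that maps to nothing costs nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class PushFlatMap {
    // Hypothetical stand-in for a push-style flat mapper: for one input
    // element it pushes zero or more results into the given sink.
    public interface FlatMapper<T, U> {
        void flattenInto(Consumer<? super U> sink, T element);
    }

    // Example mapper: "a3" pushes 1, 2, 3; "c0" pushes nothing at all.
    public static final FlatMapper<String, Integer> COUNT_TO = (sink, s) -> {
        int n = Integer.parseInt(s.substring(1));
        for (int i = 1; i <= n; i++) {
            sink.accept(i);
        }
    };

    // Push-style flatMap: results go directly downstream. No per-element
    // collection or iterator is created, and empty results are free.
    public static <T, U> void flatMapForEach(Iterable<T> source,
                                             FlatMapper<T, U> mapper,
                                             Consumer<? super U> downstream) {
        for (T element : source) {
            mapper.flattenInto(downstream, element);
        }
    }

    public static void main(String[] args) {
        List<Integer> out = new ArrayList<>();
        flatMapForEach(Arrays.asList("a3", "c0", "b2"), COUNT_TO, out::add);
        System.out.println(out); // [1, 2, 3, 1, 2]
    }
}
```

The trade-off is that the sink cannot tell the mapper to stop, which is why this shape is efficient when the whole input is consumed and less so when iteration ends early.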
On 11/14/2012 12:11 PM, Georgiy Rakov wrote:
> Hello,
>
> I have some concerns regarding the current FlatMapper concept.
>
> Let's consider a stream *s* of strings: {"a3", "b1000"}, and a FlatMapper
> instance *fm* which returns the following sequences for each item:
> - for "a3" it returns the integers 1, 2, 3;
> - for "b1000" it returns the integers 1, 2, ..., 1000.
>
> Suppose we have the following situations while iterating over the elements
> using the iterator returned by *s.flatMap(fm).iterator()*.
> 1. We've iterated over the first 3 items, then make a *hasNext()* call and
> quit the iteration because we need no more elements.
> In this case, when *hasNext()* is called, all 1000 elements would be
> supplied to the *sink* passed to *fm*, though there is no need for it.
>
> 2. We've iterated over the first 4 items and then quit the iteration
> because we need no more elements.
> In this case computational and memory resources will be spent on
> handling 999 elements which we have no need for.
>
> It seems desirable to avoid both of these.
>
> I'd like to note that the previous version of *flatMap*, which received a
> *Mapper<T, Iterable<U>>*, would allow us to avoid such potential waste of
> resources.
>
> Could you please give your comments?
>
> Thanks,
> Georgiy.
>
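For reference, the pull-style alternative described in the quoted message can be sketched as follows. This is an illustration only: java.util.function.Function stands in for the Mapper<T, Iterable<U>> mentioned above, and flatten, countTo, and producedAfterTaking are names invented for this example. The sketch counts how many inner elements are actually generated for the {"a3", "b1000"} input when the caller stops early; with a lazy Iterable, only the elements that are pulled get produced, at the cost of creating an iterator per input element.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class LazyFlatten {
    // Pull-style flattening: an inner iterator is opened only when needed,
    // and inner elements are generated only when the caller pulls them.
    public static <T, U> Iterator<U> flatten(Iterator<T> outer,
                                             Function<T, Iterable<U>> mapper) {
        return new Iterator<U>() {
            private Iterator<U> inner = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Skip exhausted (or empty) inner iterators.
                while (!inner.hasNext() && outer.hasNext()) {
                    inner = mapper.apply(outer.next()).iterator();
                }
                return inner.hasNext();
            }

            @Override
            public U next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return inner.next();
            }
        };
    }

    // Lazily yields 1..n and records every element it actually generates.
    public static Iterable<Integer> countTo(int n, AtomicInteger produced) {
        return () -> new Iterator<Integer>() {
            private int value = 1;

            @Override
            public boolean hasNext() {
                return value <= n;
            }

            @Override
            public Integer next() {
                if (value > n) {
                    throw new NoSuchElementException();
                }
                produced.incrementAndGet();
                return value++;
            }
        };
    }

    // Pulls k elements from the flattened {"a3", "b1000"} input, makes one
    // more hasNext() call, and reports how many elements were generated.
    public static int producedAfterTaking(int k) {
        AtomicInteger produced = new AtomicInteger();
        Iterator<Integer> it = flatten(
                Arrays.asList("a3", "b1000").iterator(),
                s -> countTo(Integer.parseInt(s.substring(1)), produced));
        for (int i = 0; i < k && it.hasNext(); i++) {
            it.next();
        }
        it.hasNext(); // as in scenario 1: must not generate any element
        return produced.get();
    }

    public static void main(String[] args) {
        System.out.println(producedAfterTaking(3)); // 3
        System.out.println(producedAfterTaking(4)); // 4
    }
}
```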
More information about the lambda-dev
mailing list