FlatMap

Kin-man Chung kinman.chung at oracle.com
Mon Oct 15 09:37:04 PDT 2012


On 10/14/12 09:19, Paul Sandoz wrote:
> On Oct 14, 2012, at 9:39 AM, Paul Sandoz<Paul.Sandoz at oracle.com>  wrote:
>
>    
>> Hi,
>>
>> This particular case could be viewed as the following if the end result should be a list of stuff:
>>
>>   customers.stream().map(c ->  c.getOrders()).into(...)
>>
>> The Destination instance passed to "into" flattens things.
>>
>>      
> Or another way to think of this is as a fold with the same reducing and combining functions:
>
>
>      public static class CollectionCombiner<T, U extends Collection<T>>  implements BinaryOperator<U>   {
>
>          private static CollectionCombiner INSTANCE = new CollectionCombiner();
>
>          @Override
>          public U operate(U left, U right) {
>              left.addAll(right);
>              return left;
>          }
>      }
>
>      public static<T, U extends Collection<? extends T>>  BinaryOperator<U>  combiner() {
>          return (BinaryOperator<U>) CollectionCombiner.INSTANCE;
>      }
>
>      customers.stream().map(c ->  c.getOrders()).reduce(new ArrayList<>(), combiner());
>
>    
This is no simpler than my original expression

   customers.stream().flatMap((s,c) -> c.getOrders().forEach(o -> s.apply(o)))

The bad thing about using such expressions, as far as APIs are concerned,
is that it forces users to use constructs that really are
implementation details.  I think flattening a Collection of Collections is
a very common use case, and we should have an API for it.  It is what
other languages, such as C# and Scala, offer.
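
To illustrate the shape of such a convenience, here is a minimal sketch
written as a static helper over plain collections.  It assumes
java.util.function.Function as the mapper type; the names Flatten,
flatMap and Customer are hypothetical and are not part of the current
lambda API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the Customer type used in this thread.
final class Customer {
    private final List<String> orders;

    Customer(String... orders) {
        this.orders = Arrays.asList(orders);
    }

    List<String> getOrders() {
        return orders;
    }
}

// Hypothetical helper: the names are illustrative only.
final class Flatten {
    private Flatten() {}

    // Maps each element to a collection and concatenates the results
    // in encounter order.  The caller never sees a sink parameter.
    static <T, R> List<R> flatMap(
            Collection<? extends T> source,
            Function<? super T, ? extends Collection<? extends R>> mapper) {
        List<R> result = new ArrayList<>();
        for (T element : source) {
            result.addAll(mapper.apply(element));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("o1", "o2"),
                new Customer("o3"));
        List<String> allOrders = Flatten.flatMap(customers, c -> c.getOrders());
        System.out.println(allOrders); // prints [o1, o2, o3]
    }
}
```

This version is eager and copies every element; it only shows what the
caller would write, not how a stream implementation would do it.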

More below.
> "reduce" is a somewhat confusing name in this regard.
>
> Perhaps something like CollectionCombiner is useful functionality to add to the API.
>
> Another way to think of this is as Collection<T>  view over a Collection<Collection<T>>, which works best if we assume the collections are unmodifiable, which could work with the fold operation, if returning the view implementation type is acceptable.
>
> In conclusion, when one has a "collection lying around" there are potentially a number of approaches depending on the use-case specifics, which i think indicates such a flatmap convenience method may not be required.
>
> Paul.
>
>    
>> Paul.
>>
>> On Oct 13, 2012, at 7:13 PM, Brian Goetz<brian.goetz at Oracle.COM>  wrote:
>>
>>      
>>> Thanks for the suggestion.  This is something we've considered and is still on the "being considered" list.
>>>
>>> I think your suggestion is based on the assumption that you *already* have a collection lying around, and just want to return that.  While that is the case often, it is also often not.  In the case where you do not have a collection, replacing the existing FlatMap with something like you suggest would be both painful for the user (who has to create a garbage collection in the lambda) and also less efficient.  So we could not *replace* this flatMap with one taking Mapper<T, Collection<T>>  without making the API worse.
>>>        
I'm not suggesting that we replace the current API, but that we add an
API, either by overloading flatMap or by introducing another function
(I'll defer that decision to the experts!).
>>> We could consider adding a convenience method
>>>
>>>   flatMap(Mapper<T, Collection<T>>)
>>>
>>> for the case you describe, but we have to be careful as we are bumping our heads up against erasure.  We could only have one such method (since the erasure is flatMap(Mapper)).  We would have to pick the signature carefully.  Mapper<T,Collection>?  Mapper<T,T[]>? Mapper<T,Streamable<T>>?  If we pick Streamable (which is nice because that subsumes collections), array users are hosed -- we can't later add an array version.
>>>
>>>        
Isn't an array Streamable already?  Even if it is not, it can be easily 
turned into one, right?

Note that flattening a collection of collections can be implemented very
efficiently.  One simply needs to traverse the tree and yield one
element at a time, as each is needed.  I can imagine having a TreeStream
(that implements Stream) for doing this.
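
The lazy traversal described above can be sketched with a plain
java.util.Iterator for a two-level tree.  This is only an illustration
under that assumption: FlatteningIterator is a hypothetical name, and it
is not the TreeStream mentioned above nor part of the lambda API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper: lazily flattens an Iterable of Iterables,
// producing one element at a time without copying into a buffer.
final class FlatteningIterator<T> implements Iterator<T> {
    private final Iterator<? extends Iterable<? extends T>> outer;
    private Iterator<? extends T> inner = Collections.emptyIterator();

    FlatteningIterator(Iterable<? extends Iterable<? extends T>> source) {
        this.outer = source.iterator();
    }

    @Override
    public boolean hasNext() {
        // Move to the next inner collection only when the current one
        // is exhausted; empty inner collections are skipped.
        while (!inner.hasNext() && outer.hasNext()) {
            inner = outer.next().iterator();
        }
        return inner.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return inner.next();
    }

    public static void main(String[] args) {
        List<List<String>> orders = Arrays.asList(
                Arrays.asList("a1", "a2"),
                Collections.<String>emptyList(),
                Arrays.asList("b1"));
        List<String> flat = new ArrayList<>();
        Iterator<String> it = new FlatteningIterator<>(orders);
        while (it.hasNext()) {
            flat.add(it.next());
        }
        System.out.println(flat); // prints [a1, a2, b1]
    }
}
```

No intermediate collection is built: each call to next() pulls exactly
one element from the underlying collections.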

Kin-man
>>>
>>> On Oct 12, 2012, at 9:32 PM, Kin-man Chung wrote:
>>>
>>>        
>>>> The use of the operation flatMap exposes the Block interface.  For
>>>> instance, if one wants to get the list of all orders from all customers,
>>>> one writes
>>>>
>>>>    customers.stream().flatMap((s,c) -> c.getOrders().forEach(o -> s.apply(o)))
>>>>
>>>> In this case, the purpose of the parameter s is for buffering and
>>>> streaming the elements and can really be hidden from the user.  (BTW,
>>>> if we have "yield", we can write (c->c.getOrders().forEach(o->yield o)),
>>>> but that's for another discussion.)
>>>>
>>>> I admit that the current flatMap is very general and powerful, but I
>>>> think it is more common to flatten elements that are Streamable.  Is it
>>>> possible to add another flavor of flatMap that takes a function that
>>>> returns a Streamable?  If so, the above example can be simplified to
>>>>
>>>>    customers.stream().flatMap(c ->  c.getOrders())
>>>>
>>>> which is much more readable.
>>>>
>>>>          
>>>
>>>        
>>
>>      
>
>    