flatMap
Brian Goetz
brian.goetz at oracle.com
Fri Nov 16 09:33:32 PST 2012
Yeah, I have no problem with the "flatCollect" idiom as a convenience.
I just didn't want it to be the "primitive" version flatMap, and because
there are interactions with erasure (can't overload
flatCollect(Function<T,Iterable<T>>) with flatCollect(Function<T,T[]>)
or Stream<T> or...), we have to choose wisely.
Not sure about the name flatCollect when there is no "collect"; it seems
that flatCollect modifies "collect" in some way?
On 11/16/2012 12:27 PM, Raab, Donald wrote:
> This is our signature in GS Collections:
>
> <V> RichIterable<V> flatCollect(Function<? super T, ? extends
> Iterable<V>> function);
>
> This is our lazy (stream) implementation:
>
> https://github.com/goldmansachs/gs-collections/blob/master/collections/src/main/java/com/gs/collections/impl/lazy/AbstractLazyIterable.java
>
> https://github.com/goldmansachs/gs-collections/blob/master/collections/src/main/java/com/gs/collections/impl/lazy/FlatCollectIterable.java
>
> At the end, FlatCollectIterable (or any other lazy iterble) just
> delegates to forEach(), which is exactly what I am seeing all of the
> duplicate hand coded examples doing. Am I missing something? Can’t we
> just use Iterable instead of Stream or Collection and just rely on
> forEach()?
>
> *From:*lambda-libs-spec-experts-bounces at openjdk.java.net
> [mailto:lambda-libs-spec-experts-bounces at openjdk.java.net] *On Behalf Of
> *Sam Pullara
> *Sent:* Wednesday, November 14, 2012 5:57 PM
> *To:* Joe Bowbeer
> *Cc:* lambda-libs-spec-experts at openjdk.java.net
> *Subject:* Re: flatMap
>
> My last 2 cents of caring on this:
>
> Just had a long conversation with Brian about the motivation for the
> current implementation and I understand it a lot better now. So the
> current thing in the lambda libraries has a shape more like this:
>
> <R> Stream<R> flatMap(FlatMapper<? extends R, ? super T> mapper);
>
> where FlatMapper is:
>
> void flatMapInto(Block<? super R> sink, T element);
>
> rather than the expected:
>
> <R> Stream<R> flatMap(Mapper<T, Stream<R>> mapper);
>
> The reason is that in many cases we won't have a stream available on
> the mapping side and will end up in a situation where we need to wrap
> each map, we always have to do work no matter what, etc. Especially
> since we don't have a way to easily yield a value and consume them from
> the outside. My suggestion is to change the the current flatMap's name
> to flatMapInto and implement flatMap as expected.
>
> There are cases where the normal definition of flatMap will be more
> efficient (more so over time as people build APIs as streams where
> concatenation is cheap) and it will be the expected shape of the method
> for many.
>
> Sam
>
> On Nov 13, 2012, at 10:41 AM, Joe Bowbeer <joe.bowbeer at gmail.com
> <mailto:joe.bowbeer at gmail.com>> wrote:
>
>
>
> A few notes related to analogous features:
>
> I suspect that most Java programmers using these new features will be
> more familiar with similar features in Groovy than any other language.
>
> Groovy has collect{} and flatten() but no flatCollect{} shorthand for
> collect{}.flatten().
>
> One form of flatten takes a closure, by the way:
>
> http://groovy.codehaus.org/groovy-jdk/java/util/Collection.html#flatten(groovy.lang.Closure)
>
> FWIW, here are examples of list flattening in many languages:
>
> http://rosettacode.org/wiki/Flatten_a_list
>
> --Joe
>
> On Tue, Nov 13, 2012 at 10:07 AM, Brian Goetz <brian.goetz at oracle.com
> <mailto:brian.goetz at oracle.com>> wrote:
>
> That's the obvious answer...like I said, it's just not the right answer
> here. (It can be a convenience layer built atop of a more flexible
> primitive, though.) Try writing something like "break a string into
> words" using the flatMap(Mapper<T,Stream<T>>), or more generally
> anything that naturally yields up elements one at a time.) Also the
> Mapper<T,Stream<T>> is going to be a lot less efficient. Like I said,
> it can be part of the answer, but it is absolutely the wrong primitive.
>
> Java isn't Scala. What works in Scala libraries may well not work in
> Java libraries, because the language is different. So the "right"
> answer in one language is not necessarily the right answer in another.
> (Related example: if you have tuples, a method like
> "partition(Predicate)" is more natural than if you don't. So the
> partition method is a no-brainer in Scala but entails far more tradeoffs
> in Java, and might be on the other side of the line for that reason.)
>
>
>
> On 11/13/2012 12:33 PM, Sam Pullara wrote:
>
> Honestly I would have expected only a single shape for flatMap:
>
> Stream<V> flatMap(Mapper<T, Stream<V>>)
>
>
> Where the result is a concatenation of the Streams returned. Kind of
> like doing a map from T to Stream<V> ending up with a Stream<Stream<V>>
> and then calling flatten() on that Stream. Is this not the normal
> definition? Here is the definition from Scala:
>
>
> defflatMap[B](f: (A) ⇒ GenTraversableOnce
>
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/GenTraversableOnce.html>[B]):
> Traversable
>
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/Traversable.html>[B]
>
> [use case]
> Builds a new collection by applying a function to all elements of this
> traversable collection and using the elements of the resulting collections.
> For example:
>
> def getWords(lines:Seq[String]):Seq[String] = lines flatMap (line=>
> line split"\\W+ <smb://W+> <smb://W+>")
>
> The type of the resulting collection is guided by the static type of
> traversable collection. This might cause unexpected results sometimes.
> For example:
>
> // lettersOf will return a Seq[Char] of likely repeated letters, instead
> of a Set
> def lettersOf(words:Seq[String]) = words flatMap (word=> word.toSet)
>
> // lettersOf will return a Set[Char], not a Seq
> def lettersOf(words:Seq[String]) = words.toSet flatMap (word=> word.toSeq)
>
> // xs will be a an Iterable[Int]
> val xs =Map("a" ->List(11,111),"b" ->List(22,222)).flatMap(_._2)
>
> // ys will be a Map[Int, Int]
> val ys =Map("a" ->List(1 ->11,1 ->111),"b" ->List(2 ->22,2
> ->222)).flatMap(_._2)
>
> B
> the element type of the returned collection.
> f
> the function to apply to each element.
> returns
> a new traversable collection resulting from applying the given
> collection-valued function |f| to each element of this traversable
> collection and concatenating the results.
>
> Definition Classes
> TraversableLike
>
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/TraversableLike.html>
> →
> GenTraversableLike
>
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/GenTraversableLike.html>
> →
> FilterMonadic
>
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/generic/FilterMonadic.html>
>
>
>
> Sam
>
> On Nov 13, 2012, at 7:42 AM, Brian Goetz <brian.goetz at oracle.com
> <mailto:brian.goetz at oracle.com>
> <mailto:brian.goetz at oracle.com <mailto:brian.goetz at oracle.com>>> wrote:
>
> The things I found awkward using in the kata were flatMap
>
>
> This is a complaint we received over and over again in the "Hack Day"
> sessions -- it is pretty clear we are not there yet on flatMap.
>
> 1. It is not obvious flatMap is the best name, as it sets
> expectations for Scala users that will not be met. Perhaps mapMulti?
> explode?
>
> 2. The API form is definitely not there yet. But, the "obvious"
> suggestion is clearly wrong. Assume for sake of argument that there
> will be only one version of flatMap. What everyone thinks it should
> look like (after a few seconds of thought) is:
>
> Stream<T> flatMap(Mapper<T, Collection<T>>)
>
> But, this is awful (except when you already happen to have a
> Collection lying around, which will be sometimes but by no means all
> of the time.) This forces (a) the client to create and populate a
> Collection in the lambda (yuck!), usually an extra collection to be
> created even when the element maps to nothing, and then forces an
> iteration (with Iterator) on the other side. So, in the case where
> there's no collection lying around, the client code is larger, uglier,
> and in any case the whole thing is significantly slower. The current
> version has a very nice property: when the element maps to nothing,
> there's no work to do.
>
> This is further challenging because of erasure. If we had:
>
> Stream<T> flatMap(Mapper<T, Collection<T>>)
>
> we might also want
>
> Stream<T> flatMap(Mapper<T, T[]>)
> or
> Stream<T> flatMap(Mapper<T, Stream<T>>)
> or
> Stream<T> flatMap(Mapper<T, Streamable<T>>)
>
> But, we can't overload these since their erasure is all "Mapper".
>
More information about the lambda-libs-spec-observers
mailing list