flatMap
Sam Pullara
spullara at gmail.com
Wed Nov 14 14:57:04 PST 2012
My last 2 cents of caring on this:
Just had a long conversation with Brian about the motivation for the current implementation and I understand it a lot better now. So the current thing in the lambda libraries has a shape more like this:
<R> Stream<R> flatMap(FlatMapper<? extends R, ? super T> mapper);
where FlatMapper is:
void flatMapInto(Block<? super R> sink, T element);
rather than the expected:
<R> Stream<R> flatMap(Mapper<T, Stream<R>> mapper);
The reason is that in many cases we won't have a stream available on the mapping side and will end up in a situation where we need to wrap each map, we always have to do work no matter what, etc. Especially since we don't have a way to easily yield a value and consume them from the outside. My suggestion is to change the the current flatMap's name to flatMapInto and implement flatMap as expected.
There are cases where the normal definition of flatMap will be more efficient (more so over time as people build APIs as streams where concatenation is cheap) and it will be the expected shape of the method for many.
Sam
On Nov 13, 2012, at 10:41 AM, Joe Bowbeer <joe.bowbeer at gmail.com> wrote:
> A few notes related to analogous features:
>
> I suspect that most Java programmers using these new features will be more familiar with similar features in Groovy than any other language.
>
> Groovy has collect{} and flatten() but no flatCollect{} shorthand for collect{}.flatten().
>
> One form of flatten takes a closure, by the way:
>
> http://groovy.codehaus.org/groovy-jdk/java/util/Collection.html#flatten(groovy.lang.Closure)
>
> FWIW, here are examples of list flattening in many languages:
>
> http://rosettacode.org/wiki/Flatten_a_list
>
> --Joe
>
>
> On Tue, Nov 13, 2012 at 10:07 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
> That's the obvious answer...like I said, it's just not the right answer here. (It can be a convenience layer built atop of a more flexible primitive, though.) Try writing something like "break a string into words" using the flatMap(Mapper<T,Stream<T>>), or more generally anything that naturally yields up elements one at a time.) Also the Mapper<T,Stream<T>> is going to be a lot less efficient. Like I said, it can be part of the answer, but it is absolutely the wrong primitive.
>
> Java isn't Scala. What works in Scala libraries may well not work in Java libraries, because the language is different. So the "right" answer in one language is not necessarily the right answer in another. (Related example: if you have tuples, a method like "partition(Predicate)" is more natural than if you don't. So the partition method is a no-brainer in Scala but entails far more tradeoffs in Java, and might be on the other side of the line for that reason.)
>
>
>
> On 11/13/2012 12:33 PM, Sam Pullara wrote:
> Honestly I would have expected only a single shape for flatMap:
>
> Stream<V> flatMap(Mapper<T, Stream<V>>)
>
> Where the result is a concatenation of the Streams returned. Kind of
> like doing a map from T to Stream<V> ending up with a Stream<Stream<V>>
> and then calling flatten() on that Stream. Is this not the normal
> definition? Here is the definition from Scala:
>
>
> defflatMap[B](f: (A) ⇒ GenTraversableOnce
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/GenTraversableOnce.html>[B]):
> Traversable
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/Traversable.html>[B]
>
> [use case]
> Builds a new collection by applying a function to all elements of this
> traversable collection and using the elements of the resulting collections.
> For example:
>
> def getWords(lines:Seq[String]):Seq[String] = lines flatMap (line=> line split"\\W+ <smb://W+>")
>
> The type of the resulting collection is guided by the static type of
> traversable collection. This might cause unexpected results sometimes.
> For example:
>
> // lettersOf will return a Seq[Char] of likely repeated letters, instead of a Set
> def lettersOf(words:Seq[String]) = words flatMap (word=> word.toSet)
>
> // lettersOf will return a Set[Char], not a Seq
> def lettersOf(words:Seq[String]) = words.toSet flatMap (word=> word.toSeq)
>
> // xs will be a an Iterable[Int]
> val xs =Map("a" ->List(11,111),"b" ->List(22,222)).flatMap(_._2)
>
> // ys will be a Map[Int, Int]
> val ys =Map("a" ->List(1 ->11,1 ->111),"b" ->List(2 ->22,2 ->222)).flatMap(_._2)
>
> B
> the element type of the returned collection.
> f
> the function to apply to each element.
> returns
> a new traversable collection resulting from applying the given
> collection-valued function |f| to each element of this traversable
> collection and concatenating the results.
>
> Definition Classes
> TraversableLike
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/TraversableLike.html> →
> GenTraversableLike
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/GenTraversableLike.html> →
> FilterMonadic
> <http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/generic/FilterMonadic.html>
>
>
>
> Sam
>
> On Nov 13, 2012, at 7:42 AM, Brian Goetz <brian.goetz at oracle.com
> <mailto:brian.goetz at oracle.com>> wrote:
>
> The things I found awkward using in the kata were flatMap
>
> This is a complaint we received over and over again in the "Hack Day"
> sessions -- it is pretty clear we are not there yet on flatMap.
>
> 1. It is not obvious flatMap is the best name, as it sets
> expectations for Scala users that will not be met. Perhaps mapMulti?
> explode?
>
> 2. The API form is definitely not there yet. But, the "obvious"
> suggestion is clearly wrong. Assume for sake of argument that there
> will be only one version of flatMap. What everyone thinks it should
> look like (after a few seconds of thought) is:
>
> Stream<T> flatMap(Mapper<T, Collection<T>>)
>
> But, this is awful (except when you already happen to have a
> Collection lying around, which will be sometimes but by no means all
> of the time.) This forces (a) the client to create and populate a
> Collection in the lambda (yuck!), usually an extra collection to be
> created even when the element maps to nothing, and then forces an
> iteration (with Iterator) on the other side. So, in the case where
> there's no collection lying around, the client code is larger, uglier,
> and in any case the whole thing is significantly slower. The current
> version has a very nice property: when the element maps to nothing,
> there's no work to do.
>
> This is further challenging because of erasure. If we had:
>
> Stream<T> flatMap(Mapper<T, Collection<T>>)
>
> we might also want
>
> Stream<T> flatMap(Mapper<T, T[]>)
> or
> Stream<T> flatMap(Mapper<T, Stream<T>>)
> or
> Stream<T> flatMap(Mapper<T, Streamable<T>>)
>
> But, we can't overload these since their erasure is all "Mapper".
>
>
>
>
More information about the lambda-libs-spec-observers
mailing list