flatMap (was: Updated Java 8 Stream examples in GS Collections Kata)
Howard Lovatt
howard.lovatt at gmail.com
Tue Nov 13 10:11:54 PST 2012
Comments inline below.
On 13/11/2012, at 3:42 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>> The things I found awkward using in the kata were flatMap
>
> This is a complaint we received over and over again in the "Hack Day" sessions -- it is pretty clear we are not there yet on flatMap.
>
> 1. It is not obvious flatMap is the best name, as it sets expectations for Scala users that will not be met. Perhaps mapMulti? explode?
>
> 2. The API form is definitely not there yet. But, the "obvious" suggestion is clearly wrong. Assume for sake of argument that there will be only one version of flatMap. What everyone thinks it should look like (after a few seconds of thought) is:
>
> Stream<T> flatMap(Mapper<T, Collection<T>>)
In my own library I used the equivalent of this signature, but I also supplied a collection for the lambda to fill and return when it doesn't already have one, which is nearly always the case. This default collection is obtained through a method on my equivalent of a FlatMapper interface, something like:
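interface FlatMapper<I, O> extends Mapper<I, Collection<O>> {
    // Maps one input element to zero or more output elements.
    Collection<O> flatMap(I i);

    // Returns the collection the stream installed for the current thread.
    @SuppressWarnings("unchecked")
    default Collection<O> defaultCollection() { return (Collection<O>) Trait.defaultCollection.get(); }

    // Holder for the per-thread state, since an interface cannot have mutable fields of its own.
    class Trait {
        static final ThreadLocal<Collection<?>> defaultCollection = new ThreadLocal<>();
    }
}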
Before flatMap is called, defaultCollection is set to the internal data structure of the stream in question, which from an external point of view is an AppendOnlyCollection.
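
To make the calling sequence concrete, here is a minimal sketch of how the stream side might install the default collection and then invoke flatMap. It is an illustration under assumptions, not my library's actual code: Mapper is a one-method stand-in for the lambda library's interface, FlatMapSketch, flatMapAll, and demo are hypothetical names, and a plain ArrayList stands in for the AppendOnlyCollection. Note that a lambda body cannot call a default method of the interface it implements, so the example uses an anonymous class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Assumed stand-in for the lambda library's Mapper interface.
interface Mapper<I, O> {
    O map(I i);
}

interface FlatMapper<I, O> extends Mapper<I, Collection<O>> {
    Collection<O> flatMap(I i);

    // Bridge so that flatMap is the only abstract method.
    @Override
    default Collection<O> map(I i) { return flatMap(i); }

    @SuppressWarnings("unchecked")
    default Collection<O> defaultCollection() {
        return (Collection<O>) Trait.defaultCollection.get();
    }

    class Trait {
        static final ThreadLocal<Collection<?>> defaultCollection = new ThreadLocal<>();
    }
}

final class FlatMapSketch {
    // Hypothetical driver: plays the role of the stream's flatMap implementation.
    static <I, O> List<O> flatMapAll(List<I> input, FlatMapper<I, O> mapper) {
        List<O> sink = new ArrayList<>(); // stands in for the stream's internal structure
        FlatMapper.Trait.defaultCollection.set(sink);
        try {
            for (I element : input) {
                Collection<O> result = mapper.flatMap(element);
                // If the mapper wrote straight into the sink there is nothing to copy.
                if (result != sink) {
                    sink.addAll(result);
                }
            }
        } finally {
            FlatMapper.Trait.defaultCollection.remove(); // do not leak per-thread state
        }
        return sink;
    }

    // Emits each number n exactly n times, writing directly into the default collection.
    static List<Integer> demo() {
        return flatMapAll(Arrays.asList(1, 2, 3), new FlatMapper<Integer, Integer>() {
            @Override
            public Collection<Integer> flatMap(Integer n) {
                Collection<Integer> out = defaultCollection();
                for (int k = 0; k < n; k++) {
                    out.add(n);
                }
                return out;
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [1, 2, 2, 3, 3, 3]
    }
}
```

When the mapper returns the default collection, no intermediate collection is allocated and no second iteration is needed; a mapper that already has a collection lying around can simply return it and the driver copies it in.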
> But, this is awful (except when you already happen to have a Collection lying around, which will be sometimes but by no means all of the time.) This forces (a) the client to create and populate a Collection in the lambda (yuck!), usually an extra collection to be created even when the element maps to nothing, and then forces an iteration (with Iterator) on the other side. So, in the case where there's no collection lying around, the client code is larger, uglier, and in any case the whole thing is significantly slower. The current version has a very nice property: when the element maps to nothing, there's no work to do.
>
> This is further challenging because of erasure. If we had:
>
> Stream<T> flatMap(Mapper<T, Collection<T>>)
>
> we might also want
>
> Stream<T> flatMap(Mapper<T, T[]>)
> or
> Stream<T> flatMap(Mapper<T, Stream<T>>)
> or
> Stream<T> flatMap(Mapper<T, Streamable<T>>)
>
> But, we can't overload these since their erasure is all "Mapper".
You could use FlatMapper.toArray, FlatMapper.toStream, and FlatMapper.toStreamable to give the types non-erased names.
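
One possible reading of this, sketched below: the stream accepts a single mapper type, and distinctly named static factories adapt each return shape to it, so no two overloads ever share an erasure. Everything here is an assumption for illustration. NamedFlatMapper, toCollection, ErasureSketch, and flatMapAll are hypothetical names, and java.util.function.Function and java.util.stream.Stream stand in for the draft Mapper and Stream types. Streamable is left out because there is no standard type to stand in for it.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical single mapper type that a stream's flatMap would accept.
interface NamedFlatMapper<I, O> {
    Stream<O> flatMap(I i);

    // Each factory has its own name, so overload resolution never has to
    // distinguish Function<I, Collection<O>> from Function<I, O[]> after erasure.
    static <I, O> NamedFlatMapper<I, O> toCollection(Function<I, ? extends Collection<O>> f) {
        return i -> f.apply(i).stream();
    }

    static <I, O> NamedFlatMapper<I, O> toArray(Function<I, O[]> f) {
        return i -> Arrays.stream(f.apply(i));
    }

    static <I, O> NamedFlatMapper<I, O> toStream(Function<I, Stream<O>> f) {
        return f::apply;
    }
}

final class ErasureSketch {
    // Hypothetical driver standing in for the stream's one flatMap method.
    static <I, O> List<O> flatMapAll(List<I> input, NamedFlatMapper<I, O> mapper) {
        return input.stream().flatMap(mapper::flatMap).collect(Collectors.toList());
    }

    // Mapper that naturally produces an array.
    static List<String> viaArray() {
        return flatMapAll(Arrays.asList("a b", "c"),
                NamedFlatMapper.toArray((String s) -> s.split(" ")));
    }

    // Mapper that naturally produces a stream.
    static List<Integer> viaStream() {
        return flatMapAll(Arrays.asList(1, 2),
                NamedFlatMapper.toStream((Integer n) -> Stream.of(n, n * 10)));
    }

    public static void main(String[] args) {
        System.out.println(viaArray());  // prints [a, b, c]
        System.out.println(viaStream()); // prints [1, 10, 2, 20]
    }
}
```

The cost is one extra wrapper call at each use site; the benefit is that the stream itself needs only one flatMap method.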