Whither FlatMapper?
Ali Lahijani
alahijani at gmail.com
Sun Apr 7 18:24:53 PDT 2013
On Mon, Apr 8, 2013 at 3:17 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
> I started to work through the survey comments on FlatMapper, which
> amounted to "hate the name", "need more examples", "hard to understand."
> As I started to write more examples, and consider some of the things that
> have changed in the implementation recently, I am starting to think that
> maybe now we *can* actually get away with only the "obvious" (but still
> less performant) form.
>
> What people think they want is:
>
> flatMap(T -> Stream<U>)
>
> And, in a perfect world, they would be right. The reason this has
> historically been a bad fit is that the performance cost of this version
> over the "explicit" version was enormous. (It was merely bad for the "I
> already have a collection lying around" case, but horrible for the "I am
> generating values" case.)
>
> But, a lot has happened recently in the implementation. Previously, each
> *iteration* would have generated a Spliterator, a Supplier<Spliterator>, a
> Pipeline, a PipelineHelper, and a ForEachTask -- just to pass the values
> down the stream. Since then, the supplier and helper are gone, the
> spliterator can likely be merged with the pipeline, and the forEach
> eliminated in most cases. And there is still quite a bit more running room
> to further decrease the cost of building small streams. There's a dozen
> small things we can do -- many implementation-only, but some are small API
> additions (such as singletonStream(T)) -- to bring this cost down further.
>
> Even with the general forms available, almost no one understands how they
> work, and even those who figure it out still can't figure out why they
> would want it. The pretty version is just so attractive that no one is
> willing to believe that it is painfully slow compared to the ugly version.
> Given that this adds seven new SAMs (a significant fraction of the public
> API surface area of java.util.stream), I'm having second thoughts on
> including these now.
>
> So, concrete proposal:
> - Drop all FlatMapper.* SAMs;
> - Drop all forms of flatMap(FlatMapper*)
> - Add back flatMapToXxx(Function<T, XxxStream>) to Stream
>
>
Or you can keep both forms, and do it in an elegant way.
Define an abstraction, Generator<E>, for anything that supports forEach():
    interface Generator<E> {
        void forEach(Consumer<E> c);
    }
Trivially, all Collections and Streams are Generators. But you can also
define Generators in push mode:
    Generator<Integer> g = (c) -> {
        for (int i = 0; i < 100; i++) {
            c.accept(i);
        }
    };
Now:
- Add flatMapToXxx(Function<T, XxxGenerator>) to Stream
The body of the function can return a Collection,
    s.flatMapToInt(e -> Arrays.asList(e, e+1, e+2));
or a Stream,
    s.flatMapToInt(e -> Arrays.asList(e, e+1, e+2).stream());
But since Generator is itself a functional interface, advanced users like
Sam can return a Generator:
    s.flatMapToInt(e -> c -> {
        c.accept(e);
        c.accept(e + 1);
        c.accept(e + 2);
    });
which might be more efficient.
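
To make the idea concrete, here is a minimal, self-contained sketch of the Generator approach. It is only an illustration under stated assumptions: Generator is the hypothetical interface proposed above, not an existing java.util.stream type, and the flatMap helper, the GeneratorSketch class, and the viaCollection/viaPush methods are made-up names that operate on a plain List rather than on a real Stream. A List is adapted to a Generator with a method reference to its forEach method, since it does not implement the hypothetical interface directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

public class GeneratorSketch {

    // Hypothetical abstraction from the proposal: anything that can push
    // its elements into a Consumer. Not part of java.util.stream.
    @FunctionalInterface
    public interface Generator<E> {
        void forEach(Consumer<E> c);
    }

    // Hypothetical stand-in for flatMap(Function<T, Generator<U>>),
    // written over a List so that the sketch stays self-contained.
    public static <T, U> List<U> flatMap(List<T> source, Function<T, Generator<U>> mapper) {
        List<U> out = new ArrayList<>();
        for (T t : source) {
            // Each generator pushes its elements straight into the output.
            mapper.apply(t).forEach(out::add);
        }
        return out;
    }

    // "Pretty" form: build a List per element and adapt it via List.forEach.
    public static List<Integer> viaCollection(List<Integer> source) {
        return flatMap(source, e -> Arrays.asList(e, e + 1, e + 2)::forEach);
    }

    // Push-mode form: emit the values directly, without an intermediate List.
    public static List<Integer> viaPush(List<Integer> source) {
        return flatMap(source, e -> c -> {
            c.accept(e);
            c.accept(e + 1);
            c.accept(e + 2);
        });
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 10);
        System.out.println(viaCollection(source)); // [1, 2, 3, 10, 11, 12]
        System.out.println(viaPush(source));       // [1, 2, 3, 10, 11, 12]
    }
}
```

Both forms go through the same flatMap signature and produce the same result; the only difference is whether the mapper allocates a List per element or pushes values directly. Whether the push-mode form is actually faster in a real Stream pipeline would need to be measured.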
Of course, you can still:
- Drop all FlatMapper.* SAMs;
- Drop all forms of flatMap(FlatMapper*)
Best
More information about the lambda-libs-spec-observers
mailing list