explode

Brian Goetz brian.goetz at oracle.com
Wed Feb 6 15:30:15 PST 2013


> You said that we should not use Collection explicitly in the stream API
> hence we don't have toList(), toSet(), or groupBy() but
> collect(toList()), collect(toSet()) or  collect(groupingBy)
> and at the same time, for flatMap which will be less used, you want to
> add flatMapToCollection, flatMapToArray.

Yes, any coupling to Collection is undesirable and has to be justified. 
  We're currently in a nice place (zero uses of Collection in Stream) so 
it would be nice to stay there, and one is a lot worse than zero.

But be careful that you don't try to turn consistency into a goal unto itself. 
  For example, the use of Collections in Collectors is an ideal 
compromise; the important thing is they are out of the core interface 
which we expect every aggregate for the next 10+ years to implement, but 
are still available for easy use through standalone static helper 
methods like groupingBy.  This is an ideal balance of giving users tools 
to do their job without tying Stream to Collection.
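
To make the split concrete, here is a small sketch of how the collection-producing operations stay outside Stream and arrive through static helpers passed to collect(). The class name CollectorsSketch and the sample data are made up for illustration; the Collectors methods are written as they appear in the released java.util.stream API, which may differ from the draft under discussion at the time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectorsSketch {

    // Illustrative sample data, not from the original thread.
    static List<String> words() {
        return Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");
    }

    // Stream itself never mentions List; the Collector supplies it.
    static List<String> asList(List<String> w) {
        return w.stream().collect(Collectors.toList());
    }

    // Same idea for Set.
    static Set<String> asSet(List<String> w) {
        return w.stream().collect(Collectors.toSet());
    }

    // groupingBy is a static helper that builds a Map of Lists;
    // again the Collection types live in Collectors, not in Stream.
    static Map<Character, List<String>> byInitial(List<String> w) {
        return w.stream().collect(Collectors.groupingBy(s -> s.charAt(0)));
    }

    public static void main(String[] args) {
        System.out.println(asList(words()));
        System.out.println(byInitial(words()));
    }
}
```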

> I think you should be at least consistent, so either we have an Exploder
> like we have a Collector,
> or we have several overloads for flatMap, groupBy and toList/toSet.

Personally, I would (fairly strongly) prefer to have only:

   Stream<U> flatMap(FlatMapper<T, U>)

and

   Stream<U> flatMap(Function<T, Stream<U>>)

One can quite easily derive the Collection (and with slightly more work, 
array) cases from the first form (or the second form, with more runtime 
overhead):

   .flatMap((t, sink) -> getColl(t).forEach(sink))
   .flatMap(t -> getColl(t).stream())
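
A runnable sketch of those two derivations follows, with several assumptions stated up front. FlatMapper here is a hypothetical stand-in declared locally (the method name flattenInto is invented), the sink form is emulated by a static helper that buffers eagerly rather than by a real lazy Stream method, and getColl is a made-up function that splits a comma-separated string. Only the second form, flatMap(Function<T, Stream<U>>), is called on Stream directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapSketch {

    // Hypothetical stand-in for FlatMapper<T, U>: push zero or more
    // results for one input element into the sink.
    @FunctionalInterface
    interface FlatMapper<T, U> {
        void flattenInto(T element, Consumer<U> sink);
    }

    // Emulates the sink form as a static helper. It buffers everything
    // eagerly, which a real stream operation would not need to do.
    static <T, U> Stream<U> flatMap(Stream<T> source, FlatMapper<T, U> mapper) {
        List<U> buffer = new ArrayList<>();
        source.forEach(t -> mapper.flattenInto(t, buffer::add));
        return buffer.stream();
    }

    // Hypothetical getColl: maps one element to a Collection of results.
    static Collection<String> getColl(String t) {
        return Arrays.asList(t.split(","));
    }

    // Collection case derived from the sink form.
    static List<String> viaSink(List<String> input) {
        Stream<String> flat =
            flatMap(input.stream(), (t, sink) -> getColl(t).forEach(sink));
        return flat.collect(Collectors.toList());
    }

    // Collection case derived from the Function<T, Stream<U>> form;
    // this creates one intermediate Stream per input element.
    static List<String> viaStream(List<String> input) {
        return input.stream()
                    .flatMap(t -> getColl(t).stream())
                    .collect(Collectors.toList());
    }

    // Array case from the sink form: slightly more work, an explicit loop.
    static List<String> viaArray(List<String> input) {
        Stream<String> flat = flatMap(input.stream(), (t, sink) -> {
            for (String s : t.split(",")) {
                sink.accept(s);
            }
        });
        return flat.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a,b", "c", "d,e,f");
        System.out.println(viaSink(input));
        System.out.println(viaStream(input));
        System.out.println(viaArray(input));
    }
}
```

All three helpers should yield the same flattened list for the same input; the difference is only in how much intermediate machinery each one allocates per element.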

In fact, the first is what we originally had.  But then people howled 
that (a) "I can't understand flatMap" and (b) "I think flatMap should 
take a Function<T, Collection<U>>".  In our early focus groups, people 
saw the base form of FlatMap and universally cried "WTF?"  People can't 
understand it.  After 100 people make the same comment, you start to get 
that it's a pain point.

So, the proposal I made today attempts to take into account that people 
are not yet ready to understand this form of flatMap, and attempts to 
compromise.  But I'll happily retreat from that, and vote for just

   Stream<U> flatMap(FlatMapper<T, U>)
   Stream<U> flatMap(Function<T, Stream<U>>)

It just seemed people weren't OK with that.  (Though to be fair, we 
didn't always have the second form, and its addition might be enough to 
avoid the need for the Collection and array forms.  It also allows 
reclaiming of the good name "flatMap", since there is actual mapping 
going on, and the generator form can piggyback on that.)

So, +1 to Remi's implicit suggestion:

   Stream<U> flatMap(FlatMapper<T, U>)
   Stream<U> flatMap(Function<T, Stream<U>>)

That's the new proposal.

Will be carved in stone in 24h unless there is further discussion :)



More information about the lambda-libs-spec-experts mailing list