groupBy / reduceBy

Brian Goetz brian.goetz at oracle.com
Sat Dec 8 12:38:40 PST 2012


So, I hate groupBy/reduceBy.  Not that I hate the idea, just their 
current realization.

Reasons to hate them:

  - They intrude Map and Collection into the Stream API, whereas 
otherwise there would be no connection (except Iterator) to Old 
Collections.  This falls short of a key goal, which is for Streams to be 
a bridge from Old Collections to New Collections in the future.  We've 
already severed the 32-bit size limitation; we've distanced ourselves 
from the pervasive mutability of Old Collections; this is the remaining 
connection that needs to be severed.

  - They are limited.  You can do one level of group-by, but you can't 
do two; it requires gymnastics to, for example, take a 
Stream<Transaction> and do a multi-level tabulation like grouping into a 
Map<Buyer, Map<Seller, Collection<Transaction>>>.  At the same time, they 
offer limited control over what kind of Map to use, what kind of 
Collection to use for the values for a given grouping, etc.

  - Guava-hostile.  Guava users would probably like groupBy to return a 
Multimap.  This should be easy, but currently is not.

  - The name reduceBy gives no clear indication of what it does.

  - Too-limited control over whether to use map-merging (required if you 
want to preserve encounter order, but probably slower) or accumulate 
results directly into a single shared ConcurrentMap (probably faster, 
but only if you don't care about encounter order).  Currently we key off 
of having an encounter order here, but this should be a user choice, not 
a framework choice.
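
To make the multi-level tabulation and the ordering trade-off above concrete, here is a minimal sketch.  It assumes the java.util.stream.Collectors API that eventually shipped in Java 8, which is not the groupBy/reduceBy API under discussion in this message.  The Transaction class, its accessors, and the sample data are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

public class GroupingSketch {
    // Hypothetical transaction type; buyer and seller are plain strings here.
    static final class Transaction {
        private final String buyer;
        private final String seller;
        private final int amount;

        Transaction(String buyer, String seller, int amount) {
            this.buyer = buyer;
            this.seller = seller;
            this.amount = amount;
        }

        String buyer() { return buyer; }
        String seller() { return seller; }
        int amount() { return amount; }
    }

    // Hypothetical sample data.
    static List<Transaction> sample() {
        return Arrays.asList(
            new Transaction("alice", "acme", 10),
            new Transaction("alice", "acme", 5),
            new Transaction("alice", "globex", 7),
            new Transaction("bob", "acme", 3));
    }

    // Two-level tabulation: buyer, then seller. The caller chooses the outer
    // Map type (TreeMap) and the value collection (List) explicitly.
    static Map<String, Map<String, List<Transaction>>> byBuyerThenSeller(List<Transaction> txns) {
        return txns.stream().collect(
            Collectors.groupingBy(Transaction::buyer, TreeMap::new,
                Collectors.groupingBy(Transaction::seller, Collectors.toList())));
    }

    // Accumulates into a single shared ConcurrentMap. The order of elements
    // within each list is not guaranteed; the user opts into that explicitly.
    static ConcurrentMap<String, List<Transaction>> byBuyerConcurrent(List<Transaction> txns) {
        return txns.parallelStream().collect(
            Collectors.groupingByConcurrent(Transaction::buyer));
    }

    public static void main(String[] args) {
        System.out.println(byBuyerThenSeller(sample()).keySet());
        System.out.println(byBuyerConcurrent(sample()).keySet());
    }
}
```

In this sketch the choice between merging per-thread maps and sharing one concurrent map is made by the caller picking groupingBy or groupingByConcurrent, rather than by the framework inspecting encounter order.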

These negatives play into the motivation for some upcoming proposals 
about reduce forms, which will propose a new, generalized formulation 
of these methods that addresses these negatives.  Key observations:
  - groupBy is really just reduceBy where the reduce seed is "new 
ArrayList" and the combiner function is ArrayList::add
  - reduceBy is really just a reduce whose combiner function 
incorporates some mutable map mechanics
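
The two observations above can be sketched as follows.  This is only an illustration under stated assumptions, not the proposed API: the reduceBy and groupBy helpers, their signatures, and the use of HashMap are hypothetical.  List::add stands in for ArrayList::add so that the result can be typed as Map<K, List<T>>, and what the text calls the combiner appears here as an accumulator that folds one element into a per-key container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.function.Supplier;

public class ReduceBySketch {
    // Hypothetical reduceBy: a reduce over the input whose running state is a
    // mutable map from key to a per-key result container.
    static <T, K, R> Map<K, R> reduceBy(Iterable<T> source,
                                        Function<T, K> classifier,
                                        Supplier<R> seed,
                                        BiConsumer<R, T> accumulator) {
        Map<K, R> result = new HashMap<>();
        for (T element : source) {
            K key = classifier.apply(element);
            // Create the per-key container from the seed on first sight of a key.
            R bucket = result.computeIfAbsent(key, k -> seed.get());
            accumulator.accept(bucket, element);
        }
        return result;
    }

    // Hypothetical groupBy: reduceBy with seed "new ArrayList" and an
    // accumulator that adds each element to its key's list.
    static <T, K> Map<K, List<T>> groupBy(Iterable<T> source, Function<T, K> classifier) {
        return reduceBy(source, classifier, ArrayList::new, List::add);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cc", "d");
        System.out.println(groupBy(words, String::length));
    }
}
```

This sequential version sidesteps the merging question entirely; a parallel formulation would also need a way to combine two partial maps, or a shared concurrent map.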



More information about the lambda-libs-spec-observers mailing list