Collector / Collectors

Brian Goetz brian.goetz at oracle.com
Wed Jun 26 11:26:16 PDT 2013


As promised, here are what I see as the two last loose ends in 
Collector(s), both of which came out of the spec review.

1.  Collector.of() doesn't let you set the characteristics.  This means 
you can't make a concurrent collector.

This is easily fixed by adding Characteristic... arguments to the two 
static factory methods.  There would be some slight ambiguity if it were 
legal for the finisher to be null, but a null finisher is invalid, so 
this is not a real concern.

So
     static<T, A, R> Collector<T, A, R> of(Supplier<A> supplier,
                                           BiConsumer<A, T> accumulator,
                                           BinaryOperator<A> combiner,
                                           Function<A, R> finisher) {

would become

     static<T, A, R> Collector<T, A, R> of(Supplier<A> supplier,
                                           BiConsumer<A, T> accumulator,
                                           BinaryOperator<A> combiner,
                                           Function<A, R> finisher,
                                           Characteristic... 
characteristics) {


2.  Making the one-arg reducing(), and minBy/maxBy, return Optional 
means that queries like "tallest person by city" end up with Optional in 
the value:

   Map<City, Optional<Person>> m
     = people.collect(groupingBy(Person::getCity,
                                 maxBy(comparing(Person::getHeight)));

Which is doubly bad because the optionals here will *always* be present, 
since otherwise there'd be no associated key.

I can see a few ways to address this:
  - Provide non-optional versions of minBy/maxBy/reducing, which would 
probably have to return null for "no elements".
  - Provide a way to add a finisher to a Collector, which is more 
general (and could be used, say, to turn toList() into something that 
always collects to an immutable list.)

The latter could, in turn, be done in one of two ways:

  - Add a andThen(f) method to Collector.  Then the above would read:

    groupingBy(Person::getCity,
               maxBy(comparing(Person::getHeight))
                 .andThen(Optional::get))

  - Add a combinator:

    groupingBy(Person::getCity,
               finishing(maxBy(comparing(Person::getHeight)),
                         Optional::get)));

I prefer the former because it reads better in usage; using a combinator 
function feels a little "inside out."


On 6/25/2013 4:26 PM, Brian Goetz wrote:
> Here's the review comments we got.  Most have already been dealt with.
> One additional suggestion for a new Collector (transforming) comes out
> of this.
>
>  From Mike:
>   - Need extra disclaimers on summingDouble/averagingDouble.  Done.
>
>  From Sam:
>   - Fix the examples where things have changed to Optional.  Done.  See [1]
>
>  From Joe:
>   - <code> tags leaking into output.  This turns out to be a Javadoc bug
> when you have a @link to a class that is not present.  Will not appear
> in final output.
>   - Example section repeated between Collector and Collectors.  I think
> this is OK; users may land on either first, and I think these examples
> are important.  Sadly Javadoc has no normal form.
>   - General spec nits (some addressed already, others will be addressed
> in final pass).
>
>  From Remi:
>   - I don't understand how to create a Collector instead of forEach in
> this case:
>      ConcurrentHashMap<String, LongAdder> map
>          = new ConcurrentHashMap<>();
>      Arrays.stream(args).parallel()
>            .forEach( arg -> map.computeIfAbsent(arg, k -> new
> LongAdder()).increment() );  See [2]
>
>
> [1] After changing the example:
>
>   *     Map<Department, Employee> highestPaidByDept
>   *         = employees.stream()
>   * .collect(Collectors.groupingBy(Employee::getDepartment,
>   * Collectors.maxBy(Comparators.comparing(Employee::getSalary))));
>
> to
>
>   *     Map<Department, Optional<Employee>> highestPaidByDept
>   *         = employees.stream()...
>
> this suggests we might want a combinator to add a post-function to an
> existing Collector, such as:
>
>    <T,A,R,U> Collector<T,A,U>
>              transforming(Collector<T,A,R> collector,
>                           Function<R,U> transformer)
>
> that can be used in the downstream of a groupingBy to transform away the
> Optional at the end:
>
>    groupingBy(Employee::getDepartment,
>               transforming(maxBy(...), Optional::get))
>
> THis way, the user can get a Map<Department, Employee> rather than a
> Map<Departhment, Optional<Employee>>.
>
>
> [2] This can be done, but it involves boxing.
>
> Collector<Long, LongAdder, LongAdder> c =
>    Collectors.of(LongAdder::new,
>                  (a, v) -> a.increment(),
>                  (a, b) -> /* merge LongAdder tables by key */);
>
> ConcurrentMap<String, LongAdder> map
>     = stream.collect(groupingByConcurrent(Function.identity(), c));
>
>
>
> On 6/25/2013 4:01 PM, Brian Goetz wrote:
>> I've closed the survey.  You can see responses here:
>>
>> https://www.surveymonkey.com/sr.aspx?sm=fAIHxonslEapeh3et_2biuRHRGUer22Q8awZI1yaJrsT0_3d
>>
>>
>>
>> I'll be summarizing and responding to these soon.
>>
>> On 6/18/2013 11:58 AM, Brian Goetz wrote:
>>> I think I may (finally!) have a final API candidate for Collector /
>>> Collectors.  Updated docs here:
>>>
>>>    http://cr.openjdk.java.net/~briangoetz/doctmp/doc/
>>>
>>> Salient recent changes:
>>>   - Incorporation of finishing function into Collector
>>>   - Renamings in Collector API
>>>   - Morph toStringBuilder, toStringJoiner into joining()
>>>   - Add prefix/suffix version of joining()
>>>   - Addition of summingBy{Int,Long,Double}
>>>   - Addition of averagingBy{Int,Long,Double}
>>>   - Rename toXxxSummaryStatistics to summarizingXxx
>>>   - no-see reducing, and minBy/maxBy collectors return Optional
>>>   - elimination of canned merger functions from API
>>>
>>> Please review the API and spec.  I have set up a SurveyMonkey with the
>>> usual password to collect review comments at:
>>>
>>>    https://www.surveymonkey.com/s/3TTBJ7K
>>>
>>>
>


More information about the lambda-libs-spec-experts mailing list